#6 winter 2014
OTECH MAGAZINE

Michael Rosenblum - Challenging Dynamic SQL Myths
Lonneke Dikmans - Choosing the right Mobile Architecture
Lucas Jellema - The Rapid Reaction Force – real time business monitoring
Foreword
New Year's Resolutions

The holiday season is upon us. This is the season to reflect on the past year and look towards the future. Most people start thinking of brilliant New Year's resolutions to make amends for the faults discovered during those reflections. 2014 has been a wonderful year in many respects. Globally, the world economy has begun to climb out of recession and technology innovation has moved at a rapid pace. The same is true of our field of work. More and more new technologies are added to the stack of tools that IT departments own, which also means there are more and more technologies to choose from when solving business problems. As I see it, 2014 is a grand year of change. Finally, topics like Big Data and the Internet of Things have shifted from tech-savvy interest groups
towards business-driven problem solvers. Plus, and maybe the most important change in the game, privacy has become a competitive advantage for most of the high-end technology companies. All the major news items in the tech industry around the globe have revolved around the topic of privacy. Consumers are more aware than ever. This relatively new insight into 'data control' also offers us a new promise: professional integrity. As we learned this past year, there is a lot to say on this topic. Some of the larger consultancy firms in our line of business did not always take professional integrity as seriously as they should have, and that reflects on all professionals in the field.
Therefore I would love to address all of you: let's make 2015 the year in which professional integrity becomes the standard again. Let's make doing business with IT knowledge fun and trustworthy again. It's up to all of us. We can make this the biggest New Year's resolution of our careers. Let's make sure that our customers can entrust us with their information time and again!

Douwe Pieter van den Bos
douwepieter@otechmag.com
Content

dear patrick... - Patrick Barel - 5
Challenging Dynamic SQL Myths - Michael Rosenblum - 8
How to protect your sensitive data using Oracle Database Vault / Creating and Testing Realms Part II - Anar Godjaev - 14
12c Partitioning for Data Warehouses - Michelle Kolbe - 20
Defining Custom Compliance Rules using 12c OEM Lifecycle Management Pack - Phillip Brown - 32
Starting WebLogic - Cato Aune - 39
DevOps and Continuous Delivery for Oracle - Goran Stankovski - 49
Choosing the right Mobile Architecture - Lonneke Dikmans - 55
Flashback - Empowering Power Users! - Biju Thomas - 64
The Rapid Reaction Force – real time business monitoring - Lucas Jellema - 73
Oracle Database In-Memory (Part I) - Mahir M Quluzade - 90
Time Series Forecasting in SQL - Kim Berg Hansen - 98
dear patrick...
Patrick Barel
Dear Patrick,

What is an ANTI-JOIN? And what is the difference between the SEMI-JOIN and the ANTI-JOIN?

Lillian Sturdey

Dear Lillian,

First of all, both SEMI-JOIN and ANTI-JOIN are not part of the SQL syntax; they are patterns. You might expect to be able to write something like:

[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
  2    FROM dept d
  3    SEMI JOIN emp e ON (e.deptno = d.deptno)
  4  /

to get all the departments that have at least one employee. Or:

[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
  2    FROM dept d
  3    ANTI JOIN emp e ON (e.deptno = d.deptno)
  4  /

to get the departments with no employees. But all you get is an error saying your command is not properly ended, which can be read as a syntax error:

ERROR at line 3:
ORA-00933: SQL command not properly ended
Maybe your first idea would be to use a normal join to get all the departments with at least one employee:
[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
  2    FROM dept d
  3    JOIN emp e ON (e.deptno = d.deptno)
  4  /

    DEPTNO DNAME          LOC
---------- -------------- -------------
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        30 SALES          CHICAGO
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        30 SALES          CHICAGO
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        10 ACCOUNTING     NEW YORK
        30 SALES          CHICAGO
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        20 RESEARCH       DALLAS
        10 ACCOUNTING     NEW YORK

14 rows selected.

But this results in a record for every row in the EMP table, and we only wanted every unique department:

    DEPTNO DNAME          LOC
---------- -------------- -------------
        30 SALES          CHICAGO
        20 RESEARCH       DALLAS
        10 ACCOUNTING     NEW YORK

Well, that's easy enough, you think, just add a DISTINCT to the statement:

[PATRICK]SQL>SELECT DISTINCT d.deptno, d.dname, d.loc
  2    FROM dept d
  3    JOIN emp e ON (e.deptno = d.deptno)
  4  /

    DEPTNO DNAME          LOC
---------- -------------- -------------
        20 RESEARCH       DALLAS
        10 ACCOUNTING     NEW YORK
        30 SALES          CHICAGO

Exactly the result we are looking for. But what if the EMP table contains hundreds, thousands or maybe millions of rows? That would mean the database has to do a lot of work to filter out the distinct values. A different, and probably better, approach is to use the SEMI-JOIN pattern. You can use the IN operator like this:

[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
  2    FROM dept d
  3   WHERE d.deptno IN (SELECT e.deptno
  4                        FROM emp e)
  5  /

    DEPTNO DNAME          LOC
---------- -------------- -------------
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        10 ACCOUNTING     NEW YORK
This is exactly what we want to see but for big tables this is not the correct way to go. For every record in the dept table all the records in
the EMP table are checked. Again, if we have a lot of employees, this means a lot of work for the database. A better SEMI-JOIN to use is the EXISTS operator:
[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
  2    FROM dept d
  3   WHERE EXISTS (SELECT 1
  4                   FROM emp e
  5                  WHERE e.deptno = d.deptno)
  6  /
Hope this gives you a bit more insight into this subject and a better understanding of the wonders of the SQL language. Notice that there are many ways to reach the same result, but one approach might be more economical than another.
Please note that with the current optimizer in the database, Oracle will rewrite your query to use the best approach for the task. If the inner table (in our example EMP) is rather small, the IN approach might be best; in other cases it might be better to use the EXISTS approach. Where in earlier versions you had to think about which way to go (IN is better for small tables, EXISTS is better for big ones), you can now rely on the optimizer to make the correct choice. If you want to see exactly the opposite of this query, i.e. all departments with no employees, you use an ANTI-JOIN pattern, which is pretty much the same but uses NOT IN or NOT EXISTS. A different approach, which I think is pretty nice, is to use an OUTER JOIN and check for the non-existence of values in a column of the outer-joined table:

[PATRICK]SQL>SELECT d.deptno, d.dname, d.loc
  2    FROM dept d
  3    LEFT OUTER JOIN emp e ON (e.deptno = d.deptno)
  4   WHERE e.empno IS NULL
  5  /

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON
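For completeness, the NOT EXISTS form of the ANTI-JOIN against the same tables could look like the sketch below (my addition, not part of the original column); it returns the same single row for department 40:

SELECT d.deptno, d.dname, d.loc
  FROM dept d
 WHERE NOT EXISTS (SELECT 1
                     FROM emp e
                    WHERE e.deptno = d.deptno);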
Happy Oracle’ing, Patrick Barel
If you have any comments on this subject or you have a question you want answered, please send an email to patrick@otechmag.com. If I know the answer, or can find it for you, maybe I can help.
Michael Rosenblum www.dulcian.com
Challenging Dynamic SQL Myths
For all of the years that Dynamic SQL has been part of the Oracle technology stack, it has developed its own mythology. The reason for these myths is the magnitude of changes that Oracle has applied to Dynamic SQL over the last decade. The problem is that some of these myths are simply wrong, and they have always been wrong. Dynamic SQL needs to be used with care, but most "conventional wisdom" about Dynamic SQL is wholly or partly incorrect. It is fair to say that limitations existed in previous versions of Dynamic SQL. Many of these limitations have been removed or reduced in the current release of the Oracle DBMS. As a result, most problems that you hear about should always be met with the question: "In which version did you experience this issue?" Unfortunately, people often listen to the detractors and thereby fail to use Dynamic SQL where it could save them a great deal of time and effort. This article challenges the most common myths and provides evidence to show why they are wrong.
Myth #1: Dynamic SQL is Always a Security Risk When using Dynamic SQL, it is very important to keep in mind the fact that the dreaded code injection can happen only if you do not follow security best practices. It is not difficult to keep your code safe by adhering to the following guidelines:
1. Always use bind variables to pass data into dynamically constructed code modules. Never concatenate. Since bind variables are processed after the code is parsed, unexpected values (such as ' OR 1=1') cannot impact what is being executed.
2. The datatypes of these variables should match the datatypes of the defined parameters. This eliminates the risk of somebody manipulating NLS settings and sneaking malicious code through the implicit datatype conversion mechanism.
3. If you need to concatenate any structural elements (columns, tables, and so on), always use the DBMS_ASSERT package. This package has a number of APIs to ensure that input values are what they should be. For example:
   a. SQL_OBJECT_NAME(string) checks whether the string is a valid object.
   b. SIMPLE_SQL_NAME(string) checks whether the string is a valid simple SQL name.
   c. SCHEMA_NAME(string) validates that the passed string is a valid schema.

It is also important to remember that technologies similar to Dynamic SQL exist in other environments, and not only in PL/SQL. For example, it is reasonably common for middle-tier developers to build their SQL statements on the fly and even open that functionality to end users via ad-hoc query tools. You need to be aware that providing users with this capability gives them far more power than you might expect and opens
new holes faster than you can programmatically close them. Any environment that allows end users to enter real code (even in the form of customized WHERE clauses) should be considered a major security threat, no matter how carefully it is constructed.
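To make guidelines 1 and 3 concrete, here is a minimal sketch of a safely constructed dynamic query. The function and table names are illustrative (they are not from the article); the point is that the identifier goes through DBMS_ASSERT while the data value is passed as a bind variable:

CREATE OR REPLACE FUNCTION f_count_by_status (
   p_table_name IN VARCHAR2,   -- structural element: must be validated
   p_status     IN VARCHAR2    -- data value: must be bound, never concatenated
) RETURN NUMBER IS
   v_sql_tx VARCHAR2(32767);
   v_cnt_nr NUMBER;
BEGIN
   -- DBMS_ASSERT.SQL_OBJECT_NAME raises an error if the input is not a valid object name
   v_sql_tx := 'SELECT COUNT(*) FROM '||
               dbms_assert.sql_object_name(p_table_name)||
               ' WHERE status = :1';
   EXECUTE IMMEDIATE v_sql_tx INTO v_cnt_nr USING p_status;
   RETURN v_cnt_nr;
END;
/

A malicious value such as ' OR 1=1 passed in p_status simply becomes bound data and can never change the statement that is executed.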
Myth #2: Dynamic SQL is Always Slower than Regular SQL

The biggest misconception that prevents people from using Dynamic SQL is the idea that it will always be slower than regular SQL. Nobody questions the fact that if you compare the execution of a plain query with exactly the same query wrapped in EXECUTE IMMEDIATE, you will see some performance degradation. However, this is not a fair test. Instead of comparing syntactically equivalent cases, you should be comparing functionally equivalent ones. If your tests are properly set up, you will find that, in some cases, Dynamic SQL provides a very viable alternative and can sometimes provide better performance than traditionally written code.

The main advantage of Dynamic SQL is that you can fine-tune your code at runtime. By adding a small extra layer of complexity, you can utilize knowledge that was not available at compilation time. This extra knowledge is what can make Dynamic SQL-based solutions more efficient than their hard-coded counterparts. In general, Dynamic SQL thrives on the unknown: the less information that is available now, the more valuable its usage. For example, even though the classic problem of the default object IN-list cardinality of 8168 can be somewhat resolved by using the DYNAMIC_SAMPLING hint, it is still valid to say that since Oracle samples the data set, there is still a probability of a mistake. If you want to be 100% certain that Oracle uses the correct cardinality, you can convert the entire module to Dynamic SQL and adjust the CARDINALITY hint on the spot:

SQL> CREATE OR REPLACE PROCEDURE p_processObjects (i_tt id_tt) IS
  2      v_sql_tx VARCHAR2(32767);
  3      v_out_dt DATE;
  4  BEGIN
  5      v_sql_tx:='SELECT min(created) FROM inlist_tab'||
  6                ' WHERE object_id IN '||
  7                '(SELECT /*+ cardinality(t '||i_tt.count||')*/ column_value '||
  8                '   FROM table(:1) t)';
  9      EXECUTE IMMEDIATE v_sql_tx INTO v_out_dt USING i_tt;
 10      dbms_output.put_line('MinDate:'||to_char(v_out_dt,'YYYYMMDD'));
 11  END;
 12  /
SQL> exec p_processObjects (id_tt(106,25,543));
MinDate:20111103
Myth #3: Dynamic SQL Always Causes Parsing

One of the most widespread myths about Dynamic SQL is that any EXECUTE IMMEDIATE call causes parsing, even when using bind variables. This myth is dangerous because people might think that if you cannot avoid parsing anyway, why bother with properly configured binds? This was indeed true in the past: up to and including Oracle 10g, if you executed N statements with bind variables, you would get 1 hard parse and N-1 soft parses. From version 11g onwards, Oracle is efficient
enough to completely skip soft parses. The proof can be demonstrated by running the following script and checking the trace:

SQL> CREATE TABLE dynamic_sql_q1 (a NUMBER);
SQL> CREATE TABLE dynamic_sql_q2 (a NUMBER);
SQL> begin
  2     dbms_monitor.session_trace_enable(waits=>true, binds=>true);
  3     FOR i IN 1..50 LOOP
  4        EXECUTE IMMEDIATE 'INSERT INTO dynamic_sql_q1(a) VALUES (:1)' USING i;
  5        EXECUTE IMMEDIATE 'INSERT INTO dynamic_sql_q2(a) VALUES (:1)' USING i;
  6     END LOOP;
  7     dbms_monitor.session_trace_disable;
  8  END;
  9  /
Note that both the DYNAMIC_SQL_Q1 and DYNAMIC_SQL_Q2 tables are brand new, so there could not be any cached information available to Oracle. This is visible from the trace file aggregated using the TKPROF utility:

...
SQL ID: 63rua82qvjn8j Plan Hash: 0
INSERT INTO dynamic_sql_q1(a) VALUES (:1)

call     count    cpu  elapsed   disk  query  current   rows
-------  -----  -----  -------  -----  -----  -------  -----
Parse        1   0.00     0.00      0      0        0      0
Execute     50   0.00     0.01      0      4       94     50
Fetch        0   0.00     0.00      0      0        0      0
-------  -----  -----  -------  -----  -----  -------  -----
total       51   0.00     0.01      0      4       94     50

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 83  (recursive depth: 1)
Number of plan statistics captured: 1
********************************************************************************
SQL ID: 1x6bn2s5p36gf Plan Hash: 0
INSERT INTO dynamic_sql_q2(a) VALUES (:1)

call     count    cpu  elapsed   disk  query  current   rows
-------  -----  -----  -------  -----  -----  -------  -----
Parse        1   0.00     0.00      0      0        0      0
Execute     50   0.03     0.01      0      4       99     50
Fetch        0   0.00     0.00      0      0        0      0
-------  -----  -----  -------  -----  -----  -------  -----
total       51   0.03     0.01      0      4       99     50

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 83  (recursive depth: 1)
Number of plan statistics captured: 1
Obviously, both queries were parsed only once. Exactly the same optimization applies to a FORALL statement, in addition to firing all INSERTs as a single round trip:

SQL> CREATE TABLE dynamic_sql_q3 (a NUMBER);
SQL> DECLARE
  2      v_tt dbms_sql.Number_Table;
  3  BEGIN
  4      FOR i IN 1..50 LOOP
  5          v_tt(i):=i;
  6      END LOOP;
  7      dbms_monitor.session_trace_enable(waits=>true, binds=>true);
  8      FORALL i IN INDICES OF v_tt
  9          EXECUTE IMMEDIATE 'insert into dynamic_sql_q3(a) values (:1)'
 10          USING v_tt(i);
 11      dbms_monitor.session_trace_disable;
 12  END;
 13  /
<<<<<<<<<<<<<< Extract from the TKPROF report >>>>>>>>>>>>>>>>>>
SQL ID: 7uawqxwvd81jc Plan Hash: 0
INSERT INTO dynamic_sql_q3(a) VALUES (:1)

call     count    cpu  elapsed   disk  query  current   rows
-------  -----  -----  -------  -----  -----  -------  -----
Parse        1   0.00     0.00      0      0        0      0
Execute      1   0.01     0.01      0      4       45     50
Fetch        0   0.00     0.00      0      0        0      0
-------  -----  -----  -------  -----  -----  -------  -----
total        2   0.01     0.01      0      4       45     50

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 83  (recursive depth: 1)
Number of plan statistics captured: 1
The report above shows perfect results. For 50 processed rows, there is only one PARSE step and one EXECUTE step. This is solid proof that Dynamic SQL is no longer guilty of expending too many additional resources while preparing statements on the fly.
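The same conclusion can be cross-checked without tracing. The query below is an illustrative addition (not from the article) against the cursor statistics in V$SQL; for these statements you would expect PARSE_CALLS to stay far below EXECUTIONS:

select sql_text, parse_calls, executions
  from v$sql
 where sql_text like 'INSERT INTO dynamic_sql_q%';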
Myth #4: DDL Statements are Only for DBAs

Too many DBAs insist that no DDL commands should ever be fired in a production environment. Therefore, they often try to prevent DDLs from happening by any means, whether technical or organizational. In addition to very strict internal policies, the authors have encountered database-level triggers that limit the entire system to basic DML operations. Of course, tightening security is not such a bad idea, but it should not stop you from providing the required functionality. Because of Dynamic SQL, DDL statements are no longer the exclusive prerogative of DBAs' SQL*Plus
scripts. They can be directly integrated into PL/SQL code and become a critical part of the overall solution. Although creating real database objects on the fly can indeed "cross the line" in many formally regulated environments, there are legitimate cases that will still pass the strictest checks. One such case comes from my experience maintaining a multi-tier system where the communication between the middle tier and the database was done using a connection pool. Eventually, the following problem was observed: at the end of the day, each session from the pool locked a significant number of segments in the TEMP tablespace. The reason was that one part of the system used global temporary tables (GTTs), defined as ON COMMIT PRESERVE ROWS. Once they acquire a segment, these tables do not release it even if you delete all rows. Obviously, the only way to reset the high-water mark is to use the TRUNCATE command. However, TRUNCATE is a DDL statement and cannot be used in straight PL/SQL. That's where Dynamic SQL comes to the rescue. The following module resets all GTTs in the current session if any one of them has been touched:

PROCEDURE p_truncate IS
   v_exist_yn VARCHAR2(1);
BEGIN
   SELECT 'Y'
     INTO v_exist_yn
     FROM v$session s, v$tempseg_usage u
    WHERE s.audsid = SYS_CONTEXT('USERENV','SESSIONID')
      AND s.saddr = u.session_addr
      AND u.segtype = 'DATA'
      AND rownum = 1;
   FOR c IN (SELECT table_name
               FROM user_tables
              WHERE temporary = 'Y'
                AND duration = 'SYS$SESSION') LOOP
      EXECUTE IMMEDIATE 'truncate table '||c.table_name;
   END LOOP;
END;
Unfortunately, there is no simple way to find out exactly which table was touched, so TRUNCATE is fired for all of them. As long as the total number of GTTs is reasonably small, the overhead can be tolerated.

TIP: Never forget that all DDL statements (even unsuccessful ones) always fire a COMMIT before doing anything else. This implicit COMMIT should be taken into consideration if you add DDLs via Dynamic SQL to your procedural logic. Depending upon the requirements, you may consider using autonomous transactions to protect uncommitted changes.
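As a minimal sketch of that last suggestion (the procedure name is illustrative, and the approach assumes the DDL text is generated internally rather than taken from end users), an autonomous transaction confines the implicit COMMIT to itself:

CREATE OR REPLACE PROCEDURE p_exec_ddl (i_ddl_tx IN VARCHAR2) IS
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   -- the DDL's implicit COMMIT ends this autonomous transaction only;
   -- the caller's uncommitted changes are left untouched
   EXECUTE IMMEDIATE i_ddl_tx;
END;
/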
Summary

The biggest challenge in learning Dynamic SQL is getting past your initial fear of the feature. Of course, as with any advanced technology, there is risk in its misuse. In the case of Dynamic SQL, however, the potential benefits usually far outweigh the chances of encountering security or performance issues, as long as it is used with the care described above.
Anar Godjaev www.yapikredi.com.tr ACE Associate
Blog: anargodjaev.wordpress.com/
How to protect your sensitive data using Oracle Database Vault / Creating and Testing Realms Part II
Measures taken for data security should be considered at the database level as well as at the hardware, network and operating system levels. Companies often buy a firewall product and assume that their security problems are solved. Research shows that while firewall products can provide protection against external attacks, they offer insufficient protection against internal attacks. In particular, nothing is usually done to protect the data on the server where the database itself runs. Considering that a user with DBA authority has every privilege in the database and can perform the same operations even when connecting from other computers, the possible security gaps should be taken seriously. A database administrator's access to every piece of data is itself a security gap, as is his or her ability to connect from other computers and perform the same operations. Oracle Database Vault, one of the security solutions of the Oracle Database, can be recommended as an option that helps to solve these problems. With the increased sophistication of attacks on data, the need to put more operational controls on the database is greater than ever. Given that most customers have a small number of DBAs to manage their databases, it is very important to keep database security related tasks separate, in their own dedicated database accounts. Creating dedicated database accounts to manage database security helps customers prevent privileged DBA accounts from accessing application data, restricts ad hoc
database changes, and enforces controls over how, when and where application data can be accessed. Oracle Database Vault provides security benefits to customers even when they have a single DBA by:
• Preventing hackers from using privileged users' accounts to steal application data
• Protecting database structures from unauthorized and/or harmful changes
• Enforcing controls over how, when and where application data can be accessed
• Securing existing database environments transparently and without any application changes

Among the more common audit findings are unauthorized changes to database entitlements, including grants of the DBA role, as well as new accounts and database objects. Preventing unauthorized changes to production environments is important not only for security, but also for compliance, as such changes can weaken security and open doors to hackers, violating privacy and compliance regulations. Oracle Database Vault SQL Command Controls allow customers to control operations inside the database, including commands such as create table, truncate table and create user. Various out-of-the-box factors such as IP address, authentication method and program name help implement multi-factor authorization to deter attacks leveraging stolen passwords. These controls prevent accidental configuration changes and also prevent hackers and malicious insiders from tampering with applications. The Duty Separation feature
of Oracle Database Vault separates three responsibilities: security administration, account management and database administration. The Security Administrator is the manager of Oracle Database Vault and is responsible for all security operations in the database: he or she manages realms, command rules and factors and can run Database Vault reports, but cannot access the application data. The Account Manager can create, delete and change user accounts. The Database Administrator retains DBA functions such as backup and restore, patch application and performance management.
Oracle customers today still have hundreds and even thousands of databases distributed throughout the enterprise and around the world. However, database consolidation will continue as a cost-saving strategy in the coming years. The physical security provided by the distributed database architecture must be available in the consolidated environment. Oracle Database Vault addresses the primary security concerns of database consolidation.
First, it's important to understand the basic architecture of the application you wish to protect. For example, are the objects associated with the application spread across multiple database schemas or are they contained in a single database schema? This analysis should include all objects related to application data, including tables, views, materialized views and stored procedures. Identify the programs, processes, middle-tier connections, database users and application administrators that interact with the application objects. Once this information is obtained, the Oracle Database Vault realm definitions can be created and you can authorize who should be able to access application data. Application end users typically access application data through the middle tier. Some legacy applications may still use a client-server architecture where end users have their own account in the database. More sophisticated applications may have application-specific processes that run on the server hosting the Oracle Database. Throughout this article we used the Database Vault Administrator (DVA) console to administer Oracle Database Vault. Using DVA, we created an HR Data Realm to protect human resources data. To set up this realm in DVA, click Realms, then Create, and then name and define the HR Data Realm.
During the setup procedure, one of the main objectives was to ensure that highly privileged users were not able to access HR data but could still administer the database containing the HR Data Realm. Once the realm was named and enabled, we selected Audit on Failure so that a notification is raised if the rules are violated. Next, the objects the realm protects are added; these are referred to as Realm Secured Objects. For each object in the realm, the owner, object type and object name need to be specified. In this case, we used the wildcard (%) option to identify all objects owned by the HR user.
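The same realm definition can also be scripted with the DBMS_MACADM API instead of the DVA console. The block below is a sketch based on the 11.2 API (the article itself used the console; verify the exact signatures and constants against your release before using it):

BEGIN
   DBMS_MACADM.CREATE_REALM(
      realm_name    => 'HR Data Realm',
      description   => 'Protects the HR application schema',
      enabled       => DBMS_MACUTL.G_YES,
      audit_options => DBMS_MACUTL.G_REALM_AUDIT_FAIL);   -- Audit on Failure

   -- wildcard: every object owned by HR becomes a Realm Secured Object
   DBMS_MACADM.ADD_OBJECT_TO_REALM(
      realm_name   => 'HR Data Realm',
      object_owner => 'HR',
      object_name  => '%',
      object_type  => '%');
END;
/

The realm authorization granted later in the article would correspond to DBMS_MACADM.ADD_AUTH_TO_REALM.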
We also ran queries as the HR user to verify what the owner of the objects could do once a realm protected the objects it owned, and we made sure that no specific privileges had been granted to HR within Oracle Database Vault at this point. By default, the owner of the objects could only run Data Manipulation Language (DML) statements; Data Definition Language (DDL) statements could not be issued. The next step in the setup procedure was to verify how privileged users, such as SYSTEM, are controlled when they access objects in the realm. The goal is that whenever SYSTEM or another privileged user attempts to query an HR object, the result is a message stating that SYSTEM has insufficient privileges on the object. Similarly, SYSTEM should not be able to create objects in the HR Data Realm; Oracle Database Vault returns a violation notification instead.
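As an illustration of the kind of violation you would see (an assumed SQL*Plus session against the standard HR sample schema, not a screenshot from the article; the exact message may vary by version), a privileged user querying a realm-protected table is simply refused:

SQL> connect system
SQL> select count(*) from hr.employees;
select count(*) from hr.employees
                        *
ERROR at line 1:
ORA-01031: insufficient privileges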
Some employees will need authorization to modify the database as business needs dictate. After running the tests above, the HR user was added to the HR Data Realm using realm authorizations. Once authorized, this user could issue any statement, including DDL and DML statements.
Privileged database accounts are one of the most commonly used pathways for gaining access to sensitive application data in the database. While their broad and unrestricted access facilitates database maintenance, the same access also creates a point of attack for gaining access to large amounts of data. Oracle Database Vault realms around application schemas, sensitive tables and stored procedures provide controls that prevent privileged accounts from being exploited by hackers and insiders to access sensitive application data.
Source: Oracle Database Vault Administrator's Guide 11g Release 2 (11.2)
Michelle Kolbe www.backcountry.com
12c Partitioning for Data Warehouses
A feature of the Oracle database since Oracle 8.0 in 1997, Oracle Partitioning enables large tables and indexes to be split into smaller pieces to improve performance, availability and manageability. Queries can be sped up by orders of magnitude, which makes partitioning a key feature for data warehouse implementations. The smaller pieces that the table is split into are called partitions. Each partition has its own name and, optionally, its own storage characteristics. Partitions can be used to spread out and balance IO across different storage devices. The DBA is able to manage the partitioned table as a whole or as individual pieces. The cost-based optimizer can use partition pruning to read only the blocks from specific partitions into memory, based on the WHERE clause filters in the query.
Partitioning Strategies

Using a partitioning key, a set of columns used to determine which partition a row will exist in, tables and indexes can be partitioned using one or more of the following strategies.
• Range: Data is split based on a range of values of the partitioning key, most commonly a date field. Partitions are defined by their upper limit, with the lower limit being defined by the upper limit of the preceding partition. The last partition can optionally be open-ended with no limit to avoid errors when inserting new data.
• List: A discrete list of values is used for distributing the data; for example, a list of states may be used for the partitioning key. A DEFAULT partition is used to catch any values that do not fall into one of the defined list values.
• Hash: Data is divided into equally balanced partitions using an internal hash algorithm.
• Interval: A new feature in 11g. With this strategy, the Oracle database automatically creates new partitions when new data is inserted. As an example, on a table that is partitioned by year, when data for 2015 is inserted, a new partition for the year 2015 will be created. This type of table makes the manageability of partitions much easier.
• Reference: Leverages a parent-child relationship through the primary key - foreign key relationship so that the child table inherits the parent table's partitioning strategy. For every partition in the parent table, there is a partition in the child table. Later in this article, there is an example of Reference partitioning.
• Virtual Column-Based Partitioning: A partitioning key defined by an expression is used to distribute the data into partitions. This expression, or virtual column, is stored as metadata only.
• System: Similar to Hash partitioning, except that it does not require a key column to be specified as the hash key.

Composite partitioning is a combination of two of the partitioning strategies Range, List and Hash. A couple of examples of using this strategy would be an Order table that is range partitioned by order_date and sub-partitioned by hash on customer_id, or sub-partitioned by range on ship_date; a sketch of the first variant follows.
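The DDL for such a range-hash composite table might look like the following (a sketch of mine; the table, column and partition names are illustrative and not taken from the article):

create table orders_demo
( order_id    number,
  customer_id number,
  order_date  date
)
partition by range (order_date)
subpartition by hash (customer_id) subpartitions 4
( partition p_2013 values less than (to_date('01-01-2014','MM-DD-YYYY')),
  partition p_2014 values less than (to_date('01-01-2015','MM-DD-YYYY')),
  partition p_max  values less than (maxvalue)
);

Each yearly range partition is spread over four hash subpartitions on customer_id.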
New Features

Interval-Reference Partitioning
New in 12c is the ability to composite partition first by interval and then by reference. With this method, new partitions are created when the data arrives, and the child tables are automatically maintained with the new data. The partition names are inherited from already existing partitions.
An example
If I try to create the following two tables in Oracle 11g, the first statement will succeed but the second will give me an error.

create table orders
( order_number  number,
  order_date_id number,
  constraint orders_pk primary key(order_number)
)
partition by range(order_date_id) INTERVAL(7)
( partition p1 values less than (20140101) );

table ORDERS created.

create table orderlines
( orderline_id number,
  order_number number not null,
  constraint orderlines_pk primary key(orderline_id),
  constraint orderlines_fk foreign key (order_number) references orders
)
partition by reference(orderlines_fk);

ORA-14659: Partitioning method of the parent table is not supported

In 12c the same script gives this result:

table ORDERS created.
table ORDERLINES created.

Let's check what partitions we have. There is one partition in each table.

select table_name, partition_name, high_value, interval
  from user_tab_partitions
 where lower(table_name) in ('orders', 'orderlines');
Now let's insert some data into the orders table and check the partitions.

insert into orders values (1, 20131231);
insert into orders values (2, 20140102);
insert into orders values (3, 20140113);
commit;

1 rows inserted.
1 rows inserted.
1 rows inserted.
committed.

select table_name, partition_name, high_value, interval
  from user_tab_partitions
 where lower(table_name) in ('orders', 'orderlines');
We can see that the orders table now has 3 partitions, and 2 of them were created as INTERVAL partitions. Let's add data to the orderlines table now and check the partitions.

insert into orderlines values (1, 2);
commit;

1 rows inserted.
committed.
The child table now has one INTERVAL partition that is named the same as in the parent table. Oracle gives us the functionality to manually split partitions into smaller subsets. If we do this on an interval partition, then some of the partitions will be converted to conventional partitions instead of interval partitions. In 12c, this split will also convert the child partitions to conventional partitions.

alter table orders split partition for (20140104) at (20140104)
  into (partition p20140101, partition p20140104);

table ORDERS altered.

select table_name, partition_name, high_value, interval
  from user_tab_partitions
 where lower(table_name) in ('orders', 'orderlines');
Indexing

Regardless of the partitioning strategy, three types of indexes are available for use.
• Global Non-Partitioned Index: Identical to an index on a non-partitioned table. This type of index is typically used on a primary key field in a data warehousing environment.
• Global Partitioned Index: An index on a partitioned or non-partitioned table that has a partitioning strategy different from that of the table. This is not common in a data warehousing environment because the index maintenance is more difficult.
• Local Index: An index coupled with the underlying partitioned table. Each index partition corresponds to exactly one table partition. These are the most common in data warehousing environments because they optimize partition maintenance: when a partition is dropped, the corresponding index partition is also dropped.
Partial Index

Oracle 12c has a new feature called a partial index, which allows the user
to create indexes that only span certain partitions, not all of them. This feature works for both local and global indexes and can complement the full indexing strategy. The policy can be overridden at any time. To explain this concept, I think the Oracle chart below describes the situation best.
As you can see from this chart's example, the table was defined with 3 of the partitions having partial local and global indexes. The last partition does not have any indexes. Here's how this is implemented. First, create a partitioned table with INDEXING OFF set on the whole table definition, but with INDEXING ON for each individual partition that you want indexed.

create table orders
( order_number number
, col2         number
, col3         number
, col4         number
)
indexing off
partition by range(order_number)
( partition p1    values less than (100) indexing on,
  partition p2    values less than (200) indexing on,
  partition p3    values less than (300) indexing on,
  partition p4    values less than (400) indexing on,
  partition p5    values less than (500) indexing on,
  partition p_max values less than (MAXVALUE) indexing off
);

table ORDERS created.

Let's query user_tab_partitions to see the partition definitions.

select table_name, partition_name, high_value, indexing
  from user_tab_partitions
 where table_name = 'ORDERS';
Now create two LOCAL indexes; the first one is a partial index.

create index orders_idx1 on orders(order_number) local indexing partial;
create index orders_idx2 on orders(col2) local;

index ORDERS_IDX1 created.
index ORDERS_IDX2 created.

Let's check how these are defined in the index partitions table.

select index_name, partition_name, status
  from user_ind_partitions
 where index_name in ('ORDERS_IDX1', 'ORDERS_IDX2');

Now let's create two GLOBAL indexes, the first one being a partial index.

create index orders_idx3 on orders(col3) indexing partial;
create index orders_idx4 on orders(col4);

index ORDERS_IDX3 created.
index ORDERS_IDX4 created.

And now let's query the indexes table for these indexes.

select index_name, status, indexing
  from user_indexes
 where index_name in ('ORDERS_IDX3', 'ORDERS_IDX4');
Both indexes are VALID and the first one is defined as Partial. If we look at the number of segments created for these indexes, we can see that the local indexes have differing numbers of segments.

select segment_name, segment_type, count(*)
  from user_segments
 where segment_name in ('ORDERS_IDX1', 'ORDERS_IDX2', 'ORDERS_IDX3', 'ORDERS_IDX4')
 group by segment_name, segment_type
 order by 1;
As you can see from the results, an index partition was created on the partial index for P_MAX even though we specified INDEXING OFF; however, it is marked as UNUSABLE.
So by now you are probably asking: what does this mean when someone queries the whole dataset? Let's look at the explain plan for a query against the global partial index orders_idx3.

explain plan for select count(*) from orders where col3 = 3;
select * from table(dbms_xplan.display);
Notice that the index is used, and also that the orders table is queried with TABLE ACCESS FULL for the one partition without the index.

Partition Maintenance

The ALTER TABLE syntax can be used for maintaining partitioned tables in a similar way to non-partitioned tables. 12c has added functionality to make this table maintenance easier.

Adding or Dropping Multiple New Partitions
Prior to the 12c release, only one new partition could be added at a time, and multiple new partitions required multiple ALTER TABLE statements. The same was true for dropping partitions. In 12c, this can now be done in one statement.

ALTER TABLE orders_range_part ADD
  PARTITION year_2014 VALUES LESS THAN (to_date('01-01-2015', 'MM-DD-YYYY')),
  PARTITION year_2015 VALUES LESS THAN (to_date('01-01-2016', 'MM-DD-YYYY')),
  PARTITION year_2016 VALUES LESS THAN (to_date('01-01-2017', 'MM-DD-YYYY'));

Splitting or Merging Partitions
Previously, merging could only be done on two partitions at a time. In 12c, we can now merge multiple partitions into one in a single statement.

ALTER TABLE orders_range_part
  MERGE PARTITIONS year_2010, year_2011, year_2012, year_2013
  INTO PARTITION historical_data_partition;
12c also added functionality to split a partition into multiple partitions, versus only into two partitions in previous database releases. For instance, if we had a partition for a whole year but wanted to split it into quarters, this previously would have taken 3 ALTER TABLE statements. Now it can be done in one statement.

ALTER TABLE orders_range_part SPLIT PARTITION year_2013 INTO
  (PARTITION year_2013_q1 VALUES LESS THAN (to_date('04-01-2013', 'MM-DD-YYYY')),
   PARTITION year_2013_q2 VALUES LESS THAN (to_date('07-01-2013', 'MM-DD-YYYY')),
   PARTITION year_2013_q3 VALUES LESS THAN (to_date('10-01-2013', 'MM-DD-YYYY')),
   PARTITION year_2013_q4);
The partitions to merge can also be specified as a range or as a list of values in those partitions.

ALTER TABLE orders_range_part
  MERGE PARTITIONS year_2010 TO year_2013
  INTO PARTITION historical_data_partition;

ALTER TABLE orders_range_part
  MERGE PARTITIONS FOR (to_date('01-01-2010', 'MM-DD-YYYY')),
                   FOR (to_date('01-01-2011', 'MM-DD-YYYY')),
                   FOR (to_date('01-01-2012', 'MM-DD-YYYY')),
                   FOR (to_date('01-01-2013', 'MM-DD-YYYY'))
  INTO PARTITION historical_data_partition;
Cascading Truncate or Exchange Partition

With Reference partitioning, performing maintenance can be time consuming because the integrity of the entire hierarchy must be maintained. New with 12c is a cascading option for TRUNCATE and EXCHANGE PARTITION. CASCADE applies to the whole reference tree, and the operation happens as a single, atomic transaction: all or nothing. This functionality is also available for non-partitioned tables. To make this work, ON DELETE CASCADE is required on all foreign keys.

ALTER TABLE orders TRUNCATE PARTITION 2011_q1 CASCADE;

Figure 7 - 11g Truncate / 12c Truncate

Similar to the TRUNCATE functionality, EXCHANGE PARTITION can now also be completed in 12c in one operation, instead of a bottom-up truncate followed by a top-down insert. The reference-partitioned hierarchy structure must match between the target table and the new table being exchanged.
Figure 8 - 11g Exchange / 12c Exchange
Asynchronous Global Index Maintenance

Every TRUNCATE PARTITION or DROP PARTITION command automatically triggers asynchronous global index maintenance, meaning that there is no waiting for global index maintenance to take place. If a partition is truncated, the database keeps track of which records have been orphaned in the index and keeps the index usable; it filters these out when queries are run against the index. The DROP and TRUNCATE operations are fast, metadata-only operations which use minimal resources. The orphaned records are synchronized automatically through the Oracle job SYS.PMO_DEFERRED_GIDX_MAINT_JOB, which is scheduled
to run daily at 2 am by default. The index can also be manually cleaned up by running that job, by running ALTER INDEX REBUILD [PARTITION], or by running ALTER INDEX [PARTITION] COALESCE CLEANUP.
An Example

Create a range partitioned table with 500 records in 5 partitions.

create table orders
( order_number number )
partition by range(order_number)
( partition p1    values less than (100),
  partition p2    values less than (200),
  partition p3    values less than (300),
  partition p4    values less than (400),
  partition p5    values less than (500),
  partition p_max values less than (MAXVALUE)
);

table ORDERS created.

insert /*+ APPEND*/ into orders
select level from dual connect by level < 501;
commit;

500 rows inserted.
committed.

select count(*) from orders;

  COUNT(*)
----------
       500
Now create an index on this table. When the index is first created, it will not have any orphaned records.

create index orders_idx on orders(order_number);

index ORDERS_IDX created.

select index_name, orphaned_entries
  from user_indexes
 where index_name = 'ORDERS_IDX';
INDEX_NAME   ORPHANED_ENTRIES
-----------  ----------------
ORDERS_IDX   NO

Now we are going to truncate the partition. This statement runs super fast and the index is still valid.

alter table orders truncate partition p1 update indexes;

table ORDERS altered.

select index_name, status, orphaned_entries
  from user_indexes
 where index_name = 'ORDERS_IDX';
INDEX_NAME   STATUS   ORPHANED_ENTRIES
-----------  -------  ----------------
ORDERS_IDX   VALID    YES

The index has orphans, which can either be cleaned manually or cleaned automatically with the SYS.PMO_DEFERRED_GIDX_MAINT_JOB that runs
by default at 2 AM daily. Let's manually clean it now.

exec dbms_part.cleanup_gidx();

anonymous block completed

select index_name, status, orphaned_entries
  from user_indexes
 where index_name = 'ORDERS_IDX';
INDEX_NAME   STATUS   ORPHANED_ENTRIES
-----------  -------  ----------------
ORDERS_IDX   VALID    NO

Note that the cleanup_gidx procedure has some optional parameters. Without parameters, it runs on the entire database, cleaning up the indexes. If we include a schema name, it runs at the schema level, and if we give it a schema and a table name, it runs only for the indexes on that table. An ALTER INDEX <index name> REBUILD PARTITION <partition name> or ALTER INDEX <index name> COALESCE CLEANUP will also clean up orphaned rows.
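For reference, the scoped calls described above might look like this (a sketch; the schema and table names are placeholders, and the positional parameters should be checked against the DBMS_PART documentation for your release):

-- whole database
exec dbms_part.cleanup_gidx();
-- one schema
exec dbms_part.cleanup_gidx('SALES_DW');
-- one table in that schema
exec dbms_part.cleanup_gidx('SALES_DW', 'ORDERS');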
Online Partition Move

Data warehouses are facing even more pressure to be available 24/7. In 11g and prior versions, running certain maintenance commands would cause DML commands to be locked until the maintenance completed.
This locking often puts a strain on the data warehouse environment. The new MOVE PARTITION ONLINE functionality in 12c helps meet the requirement to remain available for data manipulation while doing partition maintenance. ALTER TABLE ... MOVE is used to move partitions with older data to slower, cheaper storage devices, since that data is not frequently accessed. By adding ONLINE to this statement, the table is not locked for DML operations while the move is taking place. This can also be used for compressing a table, as shown in the code examples below.

ALTER TABLE orders MOVE PARTITION year2010
  TABLESPACE old_storage UPDATE INDEXES ONLINE;

ALTER TABLE orders MOVE PARTITION year2010
  COMPRESS UPDATE INDEXES ONLINE;
There are a couple of best practices to note when using Online Partition Move. If you compress while DML is being performed, the concurrent DML will have an impact on compression efficiency; the best compression ratio comes from the initial bulk move. Also, you should try to reduce the number of concurrent DML operations that occur while doing a partition move, because the move requires additional disk space and resources for journaling.
Reference Partitioning

When the Oracle 11g release included a new feature called Reference
Partitioning, it was a bit limited in its scope of operations. With 12c, three new features have been added to help with creation and maintenance of this partitioning scheme. TRUNCATE CASCADE is now an option on the parent table that will automatically truncate the child partitions as well; in 11g, separate TRUNCATE statements had to be run on the parent and child tables to achieve this. Similarly, CASCADE has been added to the partition EXCHANGE feature; this will modify the child table when modifications are made to the parent table. Lastly, reference partitioning can now be used on parent tables that are interval partitioned, as mentioned above.
A Few Tips

If you want to change an existing partitioned table to interval partitioning, you can execute this command:

ALTER TABLE <table name> SET INTERVAL (numtoyminterval(1,'MONTH'));
With interval partitioning, to change the tablespace that new partitions are stored in, use:

ALTER TABLE <table name> SET STORE IN (<tablespace name>);
References

Oracle Database 12c: What's New in Partitioning? Tutorial.
http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/12c/Partitioning/12c_parti.html

Oracle Database 12c: Partitioning Improvements with Tom Kyte. Video.
https://www.youtube.com/watch?v=W71G0H91n_k
Phillip Brown www.e-dba.com
Defining Custom Compliance Rules using 12c OEM Lifecycle Management Pack
Security should be a key area of concern for any DBA, and there are many aspects to it. The starting point is a database security standard. Here you define the different aspects of database security, be it access controls, data management or hardening parameters, and determine how the various compliance checks will be made. A compliance check could be anything from user access to certain DBA_ views to whether a particular privileged account that should be locked actually is. To evaluate these compliance checks you can use the Oracle Lifecycle Management Pack. This pack gives you the functionality to define your own security standard within 12c OEM and, more importantly, the ability to create custom compliance rules. However, this can be time-consuming and complicated. This article sets out to demonstrate that, with some technical guidance, planning and consideration, defining your own custom compliance rules in 12c OEM can be as easy as locking the SYSTEM account.
The key to deploying custom compliance rules within 12c is the three amber sections: Compliance Rules, Configuration Extensions and Database SQL. Within the LCMP, all compliance rules go to define a compliance standard, which is effectively just a grouping of the rules you have defined. You can also define a compliance standard (a collection of rules) for different target types, be it host, database, application server and so on. Ultimately, all your compliance standards go to make up a compliance framework.
The Lifecycle Management Pack in OEM has been around for some time now and covers a wealth of functionality. Its formation started around 10g, but at that time the functionality was split across a number of management packs and didn't go by the LCMP name. From 12c the packs (Configuration Pack, Change Management, Provisioning) were amalgamated into the Lifecycle Management Pack, and the functionality was refined and significantly improved. The Lifecycle Management Pack can take you from bare-metal provisioning to data change management within a schema; however, the functionality I have been working with almost exclusively for the last 12 months is the area of security and compliance.
I am going to create a compliance standard in 12c OEM with three compliance rules. The compliance rules are as follows:
• The SYSTEM account must be locked
• 'DROP ANY TABLE' must be audited
• SYSOPER accounts should be used for segregation of duties

The next section will demonstrate some of the key issues when defining the SQL for your compliance rules.
Defining your Configuration Extensions

When defining configuration extensions, we need to think a little about the data being brought back into OEM. The key is to answer the compliance question in the most efficient way possible (which is true of any query, I suppose). On the configuration extension page there are a couple of key points to note:
• It is most efficient to create a single configuration extension with multiple SQL statements in it (it will become clear later why)
• Use a standard, incrementing naming convention for your aliases
• Use the select target and preview option to test your configuration extensions against a 12c OEM target

So this is a quick first attempt at getting the data into 12c OEM from our targets. It looks OK, but there are some key problems. If we select a target and have a look at the preview data brought back, I can show you a couple of issues.
With the first query, we are just bringing back too much information; we need to be selective. We really only need a single column and a single value returned. This is important when we start to write the compliance checks: they are much less flexible than configuration extensions, so you want to make your job as easy as possible here.
The next query brings back less data, but raises another issue. Here we want to know if we have enabled auditing on DROP ANY TABLE. The query is simple, asking whether we have auditing enabled for that system privilege, but the two rows returned are unexpected: it should be one, shouldn't it? If we look at the table DBA_STMT_AUDIT_OPTS we can see the problem. We have auditing enabled for specific users, which, although good, would actually give us a false positive in this case: we would think we have the auditing option enabled when in reality it is not auditing exactly what we want to know about. We want it enabled globally, i.e. where USER_NAME is NULL.
The final configuration extension brings back nothing, which you may think is OK, but it is actually an issue. If your configuration extension brings back nothing, it cannot be evaluated and is therefore treated as compliant. Regardless of what the configuration extension is, it ALWAYS needs to bring back a value.
So this is why it is very important to clarify the 'questions' you are asking prior to developing configuration extensions and subsequent compliance rules. With that in mind, our queries are updated along these lines:
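As an illustration (my own sketch, not the exact SQL from the article's screenshots), single-value checks of roughly this shape satisfy the guidelines above:

-- 1. Is the SYSTEM account locked? (one row, one value)
select account_status from dba_users where username = 'SYSTEM';

-- 2. Is DROP ANY TABLE audited globally (not just for individual users)?
select count(*) from dba_stmt_audit_opts
 where audit_option = 'DROP ANY TABLE' and user_name is null;

-- 3. How many accounts hold SYSOPER (segregation of duties)?
select count(*) from v$pwfile_users where sysoper = 'TRUE';

Each query always returns exactly one row with one value, which keeps the later compliance-rule definitions simple.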
Deploying Configuration Extensions

With that completed, we can deploy the configuration extensions and start to look at some of the data stored within the 12c OEM repository. Deployment of a configuration extension is completed via the Actions > Manage Deployments menu item on the configuration extensions main page. If you are undertaking a mass deployment across your estate, be mindful of targets which are under blackout, as those configuration extensions will not be deployed.
Now that the configuration extensions are deployed, we should be able to see not only the 'collection' of this information but the information stored as well. Once a configuration extension is deployed to a target you can see it in the list of agent scheduled jobs. The name of the
configuration extension is translated into an internal name; however, it keeps the OTECH part, making it easy to locate.
Executing this command on the agent:

emctl status agent scheduler | grep TECH
2014-11-18 21:01:07.983 : oracle_database:DBTEST:ccs_c_TECH_EXTENS_08100983F47B00C0E0531EDFF56893FA

All configuration extension data is stored in the MGMT$CCS_DATA table:

select cm_target_name, cm_snapshot_type, data_source_name, attr, value
  from mgmt$ccs_data
 where cm_target_name = 'DBTEST';

The name of the scheduler item is the same as the SNAPSHOT_TYPE in the MGMT$CCS_DATA table. Another key thing to note is that ATTR stores the COLUMN_NAME of the configuration extension query and VALUE is the data returned by the configuration extension. Therefore each COLUMN of your configuration extension will be stored as a ROW within the database. This is why configuration extensions need to be kept as simple as possible and should really just return single column values; you can tie yourself up in knots without even trying. Finally, it is worth noting that CM_SNAPSHOT_TYPE is unique per configuration extension and all the subsequent aliases are stored under the column DATA_SOURCE_NAME. When you are trying to pull lots of compliance data into 12c OEM, it is worth using as few configuration extensions as possible, as this will ease the creation of compliance rules and also keep the number of collections on the agent to a minimum.
Mapping Configuration Extensions to Compliance Rules
Now that we have the configuration extensions defined and deployed, we can create the compliance rules. When you start to generate the compliance rules there are some key things to note, mainly around the 'Define Query' and 'Define Validation' sections. When you come to define the query, the best place to start is the data modeller. From there you can select the target type and pick the configuration extensions you have already defined. Be sure to select the 'Parsed Data', as this is the actual data which you need to
defining custiom compliance rules using 12c oem lifecycle management pack 6/7 Phillip Brown validate. The key columns from the parsed data are ‘Data Source’ and ‘Value’. Once you have selected these items you don’t need to specify the contents at this point. Click on OK and when you come out of the data modeller you will see the query generated.
SELECT s1.target_guid, 'attrvalue' AS info, s2.DATA_SOURCE_NAME, s2.VALUE
FROM   MGMT$CCS_DATA s2, MGMT$ECM_CURRENT_SNAPSHOTS s1gen1, MGMT$TARGET s1
WHERE  (s1gen1.TARGET_GUID = s1.TARGET_GUID
  AND   s1gen1.ECM_SNAPSHOT_ID = s2.ECM_SNAPSHOT_ID (+)
  AND   s1.TARGET_TYPE = 'oracle_database'
  AND   s1gen1.SNAPSHOT_TYPE = 'ccs_c_OTECH_EXTENS_08100983F47B00C0E0531EDFF56893FA')
This query can now be used as the basis for ALL your compliance rules. If you execute this query in SQL*Plus you will see it brings back all your configuration extension data. What you can now do, for each compliance rule, is add in the relevant data source name (the ALIAS) which you defined in each configuration extension. If you have a standard naming convention and a single configuration extension this becomes a very quick process.

SELECT s1.target_guid, 'attrvalue' AS info, s2.DATA_SOURCE_NAME, s2.VALUE
FROM   MGMT$CCS_DATA s2, MGMT$ECM_CURRENT_SNAPSHOTS s1gen1, MGMT$TARGET s1
WHERE  (s1gen1.TARGET_GUID = s1.TARGET_GUID
  AND   s1gen1.ECM_SNAPSHOT_ID = s2.ECM_SNAPSHOT_ID (+)
  AND   s1.TARGET_TYPE = 'oracle_database'
  AND   s1gen1.SNAPSHOT_TYPE = 'ccs_c_OTECH_EXTENS_08100983F47B00C0E0531EDFF56893FA'
  AND   s2.DATA_SOURCE_NAME = 'OTECH_1')
When defining the validation of that data, i.e. whether it is compliant or not, all you need to do is select the VALUE column and define the value it would return when there is a violation. For this rule (is the SYSTEM account locked?) you specify the condition in which it is seen as a VIOLATION, i.e. when it is OPEN. The only other thing to note is with numeric checks that are based on a COUNT(*); in some cases zero will define a violation (i.e. when auditing on an object isn't enabled) and in others it will be compliant (i.e. when a user doesn't have a grant to a SYSTEM PRIVILEGE). A quick way to sanity-check a condition before wiring it into OEM is shown after the list below. When it comes to defining custom compliance rules in 12c OEM you need to consider the following:
• Translate all your compliance rules into SQL checks
• Determine for each check what the value of the violation would be
• Group SQLs together in configuration extensions and use a pre-defined naming convention for the aliases
• Prior to generating any compliance rules, ensure the configuration extensions are returning the 'expected' results from a selection of database targets
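As that sanity check (a sketch, not an OEM feature; it simply appends the intended violation predicate to the generated repository query), you can confirm in SQL*Plus that a condition flags the targets you expect before defining it in the rule:

SELECT s1.target_guid, s2.DATA_SOURCE_NAME, s2.VALUE
FROM   MGMT$CCS_DATA s2, MGMT$ECM_CURRENT_SNAPSHOTS s1gen1, MGMT$TARGET s1
WHERE  s1gen1.TARGET_GUID = s1.TARGET_GUID
AND    s1gen1.ECM_SNAPSHOT_ID = s2.ECM_SNAPSHOT_ID (+)
AND    s1.TARGET_TYPE = 'oracle_database'
AND    s1gen1.SNAPSHOT_TYPE = 'ccs_c_OTECH_EXTENS_08100983F47B00C0E0531EDFF56893FA'
AND    s2.DATA_SOURCE_NAME = 'OTECH_1'
AND    s2.VALUE = 'OPEN';   -- any row returned here represents a violation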
Cato Aune www.sysco.no/en/stikkord/middleware
Starting WebLogic
WebLogic is an advanced and feature rich application server. Unfortunately that also makes WebLogic a bit complex, and often it is possible to achieve the same task in different ways. Starting WebLogic is one of the tasks that can be done in several different ways, each ending up with a running WebLogic instance, but the result could still be a bit different, and not exactly what you really wanted. Unlike some other application servers, there are no out-of-the-box scripts for automatic start of WebLogic when the server boots. But the documentation and scripts are getting better with each release, so maybe in a few years' time you might get everything set up by the installer. This article will guide you through different options on how to start WebLogic, along with a recommended method and a set of sample scripts, a bit simplified, but that work. For the purpose of demonstrating the concepts, some more advanced items are not part of the scripts, like error handling, the possibility to start several servers at once (starting the WLST engine only once limits overhead) and starting the whole cluster in one operation.
Why use bootscripts?
There are several reasons for using bootscripts. One very important reason is that the server is started the same way every time. Why is that important? Well, if you start WebLogic with different users, at some point WebLogic won't start anymore. One common mistake is that someone starts WebLogic as user root. That will work; it is not recommended, but everything should run fine. Until the next time you try to start WebLogic as a regular user (i.e. oracle or weblogic). Some important files have changed ownership to root, and the user you try to start WebLogic with doesn't have access to those files anymore. Another issue you could run into is that, depending on how you start WebLogic, the actual runtime configuration could be different; more about that later on. And if you would like to avoid the call at 3 am from the operations center, setting up automatic boot, or giving the operations team just one command to run after server reboot, increases your chances of a good night's sleep.
Components
There are three important components in booting a WebLogic server:
• Node Manager
• WebLogic Scripting Tool (WLST)
• Shell scripts

Node Manager is a WebLogic Server utility that lets you start, shut down and restart Administration Server and Managed Server instances.
Before WebLogic 12.1.2 there was one Node Manager per server (host), while from 12.1.2 onwards the default is one Node Manager per domain, and the configuration is kept under each $DOMAIN_HOME.
When you have decided on how you want to start WebLogic, stick with that decision. Using different methods each time will give you some interesting mysteries to solve.
WebLogic Scripting Tool (WLST) is a Jython (Python) scripting tool that lets you do almost everything that can be done in WebLogic. WLST can be run with a script or interactively. It is even possible to do a recording in the Admin Console and get the result as a WLST file. Be sure to have the correct path before running WLST; you can set this by sourcing setDomainEnv.sh (i.e. . $DOMAIN_HOME/bin/setDomainEnv.sh, note the extra . before $DOMAIN_HOME).
Shell scripts can be regular bash shell scripts for Linux or cmd/bat files on Windows. Shell scripts will be used for init.d/xinit.d scripts in Linux to start Node Manager and WebLogic on server boot, or to create Windows services on Windows. When running the configuration wizard, default start scripts are generated and placed in $DOMAIN_HOME/bin. These scripts will be used later on.

Auto generated start scripts
Some scripts are autogenerated during the domain creation; most are put in $DOMAIN_HOME/bin, like setDomainEnv.sh and startManagedWebLogic.sh, while startWebLogic.sh is placed in $DOMAIN_HOME.
Different methods to start WebLogic
• Auto generated start scripts
• Using WLST and Admin Server
• Using WLST and Node Manager
To start the Admin Server, you could run $DOMAIN_HOME/startWebLogic.sh and to start the managed server you could run $DOMAIN_HOME/bin/startManagedWebLogic.sh ms1
This works well, but make sure to use nohup and put the process in the background, or the server instance will stop when you log out.

$ nohup startWebLogic.sh &
Using WLST and Admin Server
It is possible to start the Admin Server from WLST, but it is not recommended with Fusion Middleware suite products (according to the documentation). To use WLST and the Admin Server to start managed servers, there are some requirements:
• Admin Server must be running
• Node Manager must be running
• Admin Server must be able to communicate with Node Manager

Connect to the Admin Server to start managed servers:

connect(userConfigFile=userFile, userKeyFile=keyFile, url=adminUrl)
start(...)

When starting a managed server from the Admin Server via Node Manager, several environment variables are defined: JAVA_VENDOR, JAVA_HOME, JAVA_OPTIONS, SECURITY_POLICY, CLASSPATH, ADMIN_URL.

Using WLST and Node Manager
To use WLST and Node Manager, the requirements are:
• Node Manager must be up and running

Connect to Node Manager using nmConnect:

nmConnect(userConfigFile=nmUserFile, userKeyFile=nmKeyFile, host=nmHost, port=nmPort, domainName=domain, domainDir=domainPath, nmType=nmType)

Start the Admin Server and Managed Servers using nmStart:

nmStart('AdminServer')
nmStart('ms1')

Starting directly from Node Manager does not set the environment variables mentioned above. It is possible to provide the same information manually along with nmStart.

Custom WebLogic config
Often you want some custom config for each server, like heap size (-Xms, -Xmx), or something more advanced, like configuring where to find the Coherence cache in a high availability setup. Where to put your custom config depends on how WebLogic is started. If you start from the Admin Server via Node Manager, you can place the config in Arguments on the Server Start tab in the Configuration for each server. The config will be stored in config.xml and is visible from the Admin Console.
If you start directly from Node Manager this config will not be used (unless you provide it along with nmStart). Never put custom config in setDomainEnv.sh, startWebLogic.sh or startManagedWebLogic.sh; those files will be overwritten during upgrade. If you want to put the config in a script file, create setUserOverrides.sh in $DOMAIN_HOME/bin. In 12c that is enough, while in 11g you have to add the following lines to setDomainEnv.sh (this is how it is done in the 12c version of setDomainEnv.sh):

if [ -f ${DOMAIN_HOME}/bin/setUserOverrides.sh ] ; then
    . ${DOMAIN_HOME}/bin/setUserOverrides.sh
fi
Recommendations
• It is recommended to always use Node Manager to start the Admin Server and managed servers
• It is recommended to let Node Manager use a start script (StartScriptEnabled=true)
• It is recommended to start from the Admin Server to give server start arguments and SSL arguments to Node Manager

Our approach
• Enable start script in Node Manager (StartScriptEnabled=true in nodemanager.properties)
• Connect to Node Manager and start the AdminServer
• Connect to the AdminServer and start the managed servers

Make sure you set StartScriptEnabled=true in nodemanager.properties, so the default start scripts are used when starting a WebLogic instance. If you want to use your own start scripts and not the auto generated ones, you have to put the name of the start script in StartScriptName=myStartupScript.sh in nodemanager.properties. Make sure you call the other necessary scripts in $DOMAIN_HOME/bin, like setDomainEnv.sh, and use the JAVA_OPTIONS that are set if you start from the Admin Server.

Scripts
wls.py - the actual WLST script that starts and stops WebLogic instances
import sys

def startAdmin():
    print 'Starting AdminServer'
    nmConnect(userConfigFile=nmUserFile, userKeyFile=nmKeyFile, host=nmHost, port=nmPort,
              domainName=domain, domainDir=domainPath, nmType=nmType)
    nmStart('AdminServer')
    nmDisconnect()
    return

def stopAdmin():
    print 'Stopping AdminServer'
    connect(userConfigFile=wlsUserFile, userKeyFile=wlsKeyFile, url=adminUrl)
    shutdown('AdminServer', force='true')
    disconnect()
    return

def startManaged(managed):
    print 'Starting ', managed
    connect(userConfigFile=wlsUserFile, userKeyFile=wlsKeyFile, url=adminUrl)
    start(managed)
    disconnect()
    return

def stopManaged(managed):
    print 'Stopping ', managed
    connect(userConfigFile=wlsUserFile, userKeyFile=wlsKeyFile, url=adminUrl)
    shutdown(managed, force='true')
    disconnect()
    return

if ((len(sys.argv) < 2) | (len(sys.argv) > 3)):
    print 'Wrong number of arguments'
elif (sys.argv[1] == 'startadmin'):
    startAdmin()
elif (sys.argv[1] == 'stopadmin'):
    stopAdmin()
elif (sys.argv[1] == 'start'):
    startManaged(sys.argv[2])
elif (sys.argv[1] == 'stop'):
    stopManaged(sys.argv[2])
startall.sh - calls the wls.py script to start each WebLogic instance. The full path for wlst.sh (under wlserver/common/bin), config.properties and wls.py must be given (not necessary for wlst.sh if wlserver/common/bin is in $PATH).

wlst.sh -loadProperties config.properties -skipWLSModuleScanning wls.py startadmin
wlst.sh -loadProperties config.properties -skipWLSModuleScanning wls.py start ms1
stopall.sh - calls the wls.py script to stop each WebLogic instance. The full path for wlst.sh (under wlserver/common/bin), config.properties and wls.py must be given (not necessary for wlst.sh if wlserver/common/bin is in $PATH).

wlst.sh -loadProperties config.properties -skipWLSModuleScanning wls.py stop ms1
wlst.sh -loadProperties config.properties -skipWLSModuleScanning wls.py stopadmin
config.properties - adminUrl must point to the AdminServer (often a VIP address) while nmHost points to the local Node Manager serving this domain.

adminUrl=t3://wls12c.dev.sysco.no:7001
nmHost=wls12c.dev.sysco.no
nmPort=5556
nmUserFile=/u01/app/oracle/config/nmUserFile
nmKeyFile=/u01/app/oracle/config/nmKeyFile
nmType=plain
wlsUserFile=/u01/app/oracle/config/wlsUserFile
wlsKeyFile=/u01/app/oracle/config/wlsKeyFile
domain=mydomain
domainPath=/u01/app/oracle/user_projects/domains/mydomain
Encrypting username and password, so they can be referenced from wls.py as nmUserFile/wlsUserFile and nmKeyFile/wlsKeyFile, is different in 12c and 11g. The 11g method still works as of now in 12c, but is deprecated and it is better to use the 12c method.
11g method - source setDomainEnv.sh before running this command to get the paths correctly (everything on one line):

java weblogic.Admin -username nodemanager -userconfigfile /u01/app/oracle/config/nmUserFile -userkeyfile /u01/app/oracle/config/nmKeyFile STOREUSERCONFIG
12c method - source setDomainEnv.sh before running this command to get the paths correctly:

java weblogic.WLST
Encrypt username and password for Node Manager:

wls:/offline> nmConnect('nodemanager','welcome1','localhost',5556,'mydomain', '/u01/app/oracle/user_projects/domains/mydomain', 'plain')
wls:/mydomain/serverConfig> storeUserConfig('/u01/app/oracle/config/nmUserFile','/u01/app/oracle/config/nmKeyFile', 'true')
Encrypt username and password for the WebLogic admin user:

wls:/mydomain/serverConfig> connect('weblogic', 'welcome1', 't3://wls12c.dev.sysco.no:7001')
wls:/mydomain/serverConfig> storeUserConfig('/u01/app/oracle/config/wlsUserFile','/u01/app/oracle/config/wlsKeyFile','false')
/etc/init.d/nodemanager - init.d script that starts Node Manager during server boot. This is not based on the script from the 12c documentation (http://docs.oracle.com/middleware/1213/wls/NODEM/java_nodemgr.htm#BABJIDFD), but is pretty similar anyway. One important difference is that this script starts Node Manager as user oracle, while the script in the documentation starts Node Manager as root, which is not recommended. Make sure to adjust the paths in the script to fit your environment. The script must be made runnable (chmod 0755) and activated (chkconfig --add) before it will be used next time the server starts or stops.

#!/bin/sh
#
# nodemanager Oracle Weblogic NodeManager service
#
# chkconfig: 345 85 15
# description: Oracle Weblogic NodeManager service
### BEGIN INIT INFO
# Provides: nodemanager
# Required-Start: $network $local_fs
# Required-Stop:
# Should-Start:
# Should-Stop:
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: Oracle Weblogic NodeManager service.
# Description: Starts stops Oracle Weblogic NodeManager.
### END INIT INFO

. /etc/rc.d/init.d/functions

# Your WLS home directory (where wlserver is)
export MW_HOME="/u01/app/oracle/middleware/Oracle_Home"
export JAVA_HOME="/usr/java/latest"

DAEMON_USER="oracle"
PROCESS_STRING="^.*weblogic.NodeManager.*"

source $MW_HOME/wlserver/server/bin/setWLSEnv.sh > /dev/null

export NodeManagerHome="$MW_HOME/user_projects/domains/mydomain/nodemanager"
NodeManagerLockFile="$NodeManagerHome/nodemanager.log.lck"

PROGRAM="$MW_HOME/user_projects/domains/mydomain/bin/startNodeManager.sh"
SERVICE_NAME=`/bin/basename $0`
LOCKFILE="/var/lock/subsys/$SERVICE_NAME"
RETVAL=0

start() {
    OLDPID=`/usr/bin/pgrep -f $PROCESS_STRING`
    if [ ! -z "$OLDPID" ]; then
        echo "$SERVICE_NAME is already running (pid $OLDPID) !"
        exit
    fi
    echo -n $"Starting $SERVICE_NAME: "
    /bin/su $DAEMON_USER -c "$PROGRAM &"
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch $LOCKFILE
}

stop() {
    echo -n $"Stopping $SERVICE_NAME: "
    OLDPID=`/usr/bin/pgrep -f $PROCESS_STRING`
    if [ "$OLDPID" != "" ]; then
        /bin/kill -TERM $OLDPID
    else
        /bin/echo "$SERVICE_NAME is stopped"
    fi
    echo
    /bin/rm -f $NodeManagerLockFile
    [ $RETVAL -eq 0 ] && rm -f $LOCKFILE
}

restart() {
    stop
    sleep 10
    start
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart|force-reload|reload)
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart}"
        exit 1
esac

exit $RETVAL
/etc/init.d/weblogic - init.d script that calls startall.sh during server boot, and stopall.sh when the server shuts down. Make sure to adjust the paths in the script to fit your environment.
The script must be made runnable (chmod 0755) and activated (chkconfig --add) before it will be used next time the server starts or stops.

#!/bin/sh
#
# weblogic Oracle Weblogic start
#
# chkconfig: 345 85 15
# description: Oracle Weblogic service
### BEGIN INIT INFO
# Provides:
# Required-Start: $nodemanager
# Required-Stop:
# Should-Start:
# Should-Stop:
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: Oracle Weblogic service.
# Description: Starts and stops Oracle Weblogic.
### END INIT INFO

. /etc/rc.d/init.d/functions

# Your WLS home directory (where wlserver is)
export MW_HOME="/u01/app/oracle/middleware/Oracle_Home"
export BOOT_HOME="/u01/app/oracle/bootscripts"
export JAVA_HOME="/usr/java/latest"

DAEMON_USER="oracle"

source $MW_HOME/wlserver/server/bin/setWLSEnv.sh > /dev/null

PROGRAM_START="$BOOT_HOME/startall.sh"
PROGRAM_STOP="$BOOT_HOME/stopall.sh"
SERVICE_NAME=`/bin/basename $0`
LOCKFILE="/var/lock/subsys/$SERVICE_NAME"
RETVAL=0
start() {
    echo -n $"Starting $SERVICE_NAME: "
    /bin/su $DAEMON_USER -c "$PROGRAM_START &"
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch $LOCKFILE
}

stop() {
    echo -n $"Stopping $SERVICE_NAME: "
    /bin/su $DAEMON_USER -c "$PROGRAM_STOP &"
    RETVAL=$?
    [ $RETVAL -eq 0 ] && rm -f $LOCKFILE
}

restart() {
    stop
    sleep 10
    start
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart|force-reload|reload)
        restart
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart}"
        exit 1
esac
Summary
The scripts shown in this article cover just the minimum of functionality needed to start WebLogic. Feel free to use the scripts "as is" or as a basis for your own enhancements to fit your requirements. Although it is very convenient to start everything automatically, you should consider whether this is the best solution for you. If a database server is not up yet, a managed server that needs to connect to that database will end up in ADMIN state. A possible solution (but not good for all) is to set Initial Capacity to 0; then the managed server should start (but it might be a bit grumpy if someone tries to use the database connection while it is not there yet). To be on the safe side, you could create a startadmin.sh script, which just starts the Admin Server, and then use startall.sh for the managed servers. Remember to edit the /etc/init.d/weblogic script and replace startall.sh with startadmin.sh. The Admin Server will now start when the server boots, and you will have to log in and run startall.sh manually. You should still be allowed to sleep through the night; just tell the operations team to make sure all other servers are up and running before they run startall.sh.
48 – OTech Magazine #6 – winter 2014
Goran Stankovski www.limepoint.com
DevOps and Continuous Delivery for Oracle
This article introduces the DevOps and Continuous Delivery for Oracle article series. There is a common thread amongst Oracle implementations, from the smallest implementation to the largest mission-critical implementation: those that have had an active engagement from the Operations team during implementation have been far more successful with the transition to BAU. Reflecting on this, the DevOps and Continuous Delivery approaches of today are being applied to harness the power in the collaborative and cooperative efforts of multi-team environments to deliver value tomorrow. Over the next 4 issues of this magazine, we will be talking about DevOps and Continuous Delivery; in particular the challenges, pitfalls, and key considerations in embarking on such initiatives in an Oracle context.

Traditionally, development and project streams deliver capabilities to an organisation via fulfilment of business initiatives through project delivery, before handing over the capability to Operations teams to run and manage. These can range anywhere from small projects and initiatives to large transformational programmes. No matter how small or large, there are always key benefits to be realised by the business in delivery of such initiatives, or so a majority of business cases will suggest.

In traditional enterprise project delivery, a project team will work on a system for many months (often years), polishing, honing, and perfecting it until it is deemed ready to be: deployed into production; handed over to the Operations team; and then kept in place as-is until the next release. From this point onward any enhancements or remediation to these systems must be coordinated through a change management function, whose procedures are often centred on actively discouraging these activities.
Operations of the delivered capability is often not on the priority list of project teams. The project team's objective is to deliver the solution in the timelines communicated by the project management team. Timelines that are often very tight, and usually defined with little technical input and/or made with large assumptions that remain unvalidated until so late in the life of the project that remediation is impractical, if not impossible. This puts enormous pressure on project teams to deliver and transition the system to Operations for on-going management. Meanwhile Operations are often left with an unmanageable solution, with which they have had little to no involvement. From the Operations perspective they've been thrown a "ticking time bomb": while it delivers an essential capability to the business, Operations cannot effectively manage, modify, remediate, or enhance it. As a side effect of a traditional enterprise delivery model, Operations teams are incentivised to dissuade and even actively impede delivery of new, improved or streamlined business functions. After all, completely remaking "the world" will always carry the significant operational and procedural risks associated with any system that is untried and untested in the production environment. Additionally, as any new system or capability can often represent a year or more of change, it is virtually impossible to roll back if things go wrong, if not technically, then often politically. DevOps (as the name suggests) is the union of Development and Operations teams, typically demonstrated through their increased collaboration
and cooperation. Implementing DevOps and Continuous Delivery methodologies in a traditional enterprise environment has many challenges. One such challenge is that DevOps can mean many different things to different people: there is no DevOps governing or certifying body, nor is there one clear and concise definition. In many respects DevOps is about attitudes and relationships more than tooling and procedures; it is the bringing closer of Development and Operations capabilities within an organisation. One thing is clear: the greater the cohesion of Development and Operations, the greater the business value of delivered solutions and capabilities. Common objectives of DevOps include delivering capabilities with a faster time to market (or speed-to-value), reducing the failure rate of new releases, minimising the lead time between fixes, improving system quality, and achieving a faster mean time to recovery in the event of a new release impacting the current system. Continuous Delivery is the method by which such objectives can be met. Continuous Delivery is the practice of software delivery through automation, providing the ability to rapidly, reliably and repeatedly deploy software capabilities to customers with minimal manual interaction, minimal risk, and typically much higher quality. With Continuous Delivery, we continuously deliver small, understandable and reversible changes over time. Of course there is an overhead to this:
you can no longer delay all testing until the end of the project; you are now doing a deployment every week (even every day) instead of every year. Therefore automated testing and deployment systems tend to be inextricably linked to these methodologies. There are a number of key DevOps and Continuous Delivery principles that we believe are imperative in ensuring that the benefits of DevOps and Continuous Delivery can be realised.

Embrace change. Change is good. Delivering capability to the business increases business value. Being able to do this faster and better than your competition provides a business differentiator that can be used to capitalise on opportunity.

Automate everything! Automation is a foundational building block in DevOps and Continuous Delivery, providing a repeatable, reliable, and higher quality outcome. The more manual interaction you have in your system development and operation, the slower you will be able to adapt to change.

Do not avoid the "too hard" basket. If something is "difficult" or "too hard", then do it more often; it will become easier over time. If something is hard to automate, this only increases the necessity to automate it.
Keep everything in Source Control. Adopting an "Infrastructure-as-code" approach will allow you to define your infrastructure and platform through code constructs and deliver it through automation. Storing your entire platform and application configuration in source control systems, such as Git, will allow you to deliver consistent platforms and applications to support your business initiatives.

Large numbers of small changes are superior to small numbers of large changes. Smaller, digestible changes are more easily managed, and more easily rolled back.

Adopt the Continuous Improvement approach. Improving the Continuous Delivery and Automation capability over time will allow you to incorporate feedback into the process to deliver prolonged value.

So what about DevOps and Continuous Delivery in an Oracle context? Oracle is considered "Enterprise" class software. In Continuous Delivery and DevOps circles, enterprise software is viewed by many to be diametrically opposed to their key principles and tenets. It is inflexible, "hard to work with", delivered on a long software release cycle, and as a result is slow to catch up with industry trends and emerging standards. This misalignment in worldview tends to have many DevOps and Continuous Delivery initiatives place enterprise software, such as Oracle, in the "too hard" basket and leave it for a later project to worry about. Many
enterprises and businesses, however, rely on such enterprise software for a majority of their business capability, and as a result the business value of DevOps or Continuous Delivery initiatives that exclude such enterprise software is significantly diminished.
Adopting Continuous Delivery and DevOps within your enterprise requires a fundamental change to your technical project delivery capability, which will require maturity to progress it within the organisation. In our opinion, DevOps and Continuous Delivery capability maturity can be viewed in the following three distinct phases:

1. Destructible Development Environments

The foundational capability in DevOps and Continuous Delivery initiatives is the ability to deliver destructible development environments. How do you deliver consistent environments through a repeatable and predictable process? How do you deliver this across Oracle Infrastructure, Virtualisation, Database, Middleware, and Applications? This will be the topic of our next article in the series.

2. DevOps Lifecycle Management

As your maturity progresses, the next step in the evolution is bringing lifecycle and release management capability through your DevOps and Continuous Delivery framework. How do you adopt DevOps principles in your organisation's Dev/Test lifecycle? What key considerations must you take into account in order to deliver this capability? What does this mean with respect to Oracle Infrastructure, Virtualisation, Database, Middleware, and Applications technologies? Subsequent articles in the series will focus on this aspect.
3. DevOps and BAU
The final phase in the DevOps maturity is bringing the capability into business as usual (BAU). Considerations such as configuration change management and drift detection are critical in this phase. This is the most difficult phase to adopt, as there is typically a huge separation and disconnect between the Development and Operations BAU teams in many organisations. Most of these issues are typically organisational and not technical in nature. Trust plays a key part. How can BAU trust what is delivered? How can we bridge this divide? Over the next 4 issues of this magazine, we will be delivering articles that focus on each of the phases, and share our experience and recommendations in delivering such DevOps and Continuous Delivery capabilities to organisations that leverage Oracle technology.
Lonneke Dikmans www.eproseed.com
Choosing the right Mobile Architecture
Deciding you need to create a mobile app is a no-brainer these days. However, once this decision is made the next one pops up immediately: what architecture should I choose? Native? Web? Hybrid? In this article you will learn about the differences between these three architectures and the impact the mobile architecture has on your service architecture, on security and, last but not least, on scalability. Three use cases are described to guide you through the decision process.
Mobile Architecture types
There are three types of mobile architecture:
Illustration 1. Mobile architecture types

Native apps are specifically built for a specific device and depend on the specific operating system (OS) version. For example, you can build an iPhone app for iOS 8.2 using Xcode to program Objective-C. When you do that, you have access to device specific capabilities like the camera. Web applications are the opposite of that; they reside on the server and are accessed using a browser. This makes them independent of device type and OS version. You are limited in the features you can use by the capabilities of the browser and HTML5, but can use any tool that supports web applications. Hybrid apps are a mix between native apps and web apps. You can download them from the App Store, Google Play, etc. Some parts are programmed in the native app language; some parts reside on the server and are shown using embedded web views. You can access both native apps and hybrid apps online and offline, as long as everything you need is available offline. Web apps you can access online only. The table below summarizes the most important features of the three architectures:

Native App: resides on the device; downloaded from an 'App store'; native views, depending on device and OS version; access to all device features; works online and offline.
Web App: resides on the server; accessed via a URL; runs in the browser, depending on browser version; limited functionality, depending on HTML5; works online only.
Hybrid App: resides on both the device and the server; downloaded from an 'App store'; native views and embedded browser; access to all device features; works online and offline.
Oracle Mobile Application Framework offers capabilities to build native apps, hybrid apps and web apps. Apart from that you can easily access SOAP or REST services and you can send and receive push messages.
Illustration 2. Overview of Oracle Mobile Application Framework (source: Oracle Mobile Application Framework datasheet)
As you can imagine, maintaining native apps for different devices and different versions of the OS is expensive and cumbersome. That is why Oracle Mobile Application Framework is very powerful: you can build your application once and deploy it to different devices. Apart from the device type, it allows you to choose the size of the device and the OS version. You can declaratively define icons and other device specific features for the entire app. At deploy time these will be generated into your native code.
Illustration 3. Creating native apps from one code base
Of course this does not take away the burden of deployment or releasing for different platforms, but at least you can work from the same codebase.
Impact on your services
Mobile architecture is not just important for the app itself; it impacts your service landscape as well.
Protocol
Traditionally organizations have created services using SOAP and WSDL for integration purposes between systems. This can be cumbersome when creating graphical user interfaces for the web, and even more so when creating mobile applications. REST is a more lightweight, natural fit for mobile apps. There are two flavors used in REST: REST/XML and REST/JSON. The table below shows the most important differences between the styles. More and more tools support JSON: you can transform JSON objects to XML using SOA Suite or OSB, you can store native JSON objects in the database, etc. This means that JSON is slowly becoming language agnostic, like XML.

SOAP: based on operations; described by WSDL; coarse grained; message security plus 'regular' web application security (in case of HTTP); platform and language agnostic.
REST/XML: based on objects; described by WADL; fine grained (micro-services); 'regular' web application security; platform and language agnostic.
REST/JSON: based on objects; described by WADL; fine grained (micro-services); 'regular' web application security; platform agnostic, programming language specific (JavaScript).

It depends on the framework you are using how easy it is to consume SOAP or REST services. In general REST is more flexible and easier to use in mobile applications.

Granularity
When you think about the granularity of your services, you have to take the form factor (how big is the screen on my device and how do I access it or input data) into account. When you design for an iPad, you have almost as much 'real estate' as on a laptop.
Illustration 4. Difference in available ‘real estate’, or room, for data on a screen
However, when you are designing an app for a mobile phone or smart watch, this changes rapidly. Typically, a mobile phone is chatty in terms of service communication: it calls small services often, because there is less room for data on the screen and because data is often paid by volume.
The following items determine the granularity of the services you need:
• Form factor. For smaller screens (both in device size and resolution) you want small services, to avoid using too much bandwidth or memory.
• On route versus stationary. When you are on route (walking to your car) you spend less time on a transaction compared to when you are stationary (sitting at home on the couch).
• Hands free or not. When you are driving, you need voice input. When you are on the couch you can type.
• Online versus offline. If data needs to be cached you need bigger services, to make sure you fetch as much as you can before you lose the connection.
Reuse versus specific services
As you have seen in the previous two paragraphs, the needs of a mobile app as a service consumer differ per type of mobile app, based on the form factor (size/device) and context. You can solve that in two ways: reuse your current services and filter the data in your app, or create specific services for the different contexts.
Illustration 5. Reuse versus specific mobile services
Because in a mobile web app the code resides on the server, it is relatively easy to reuse the services you already built in the backend and filter the data in the web application before it gets sent over the wire. You can create a responsive application that behaves differently based on the device that the browser is running on. When you are building a native app you probably want to filter the data on the server, building a specific layer of services in the backend ('presentation' services). This avoids unnecessary delays and sending too much data to your device. The exception to this guideline is when you need to cache a lot of data on your mobile device; in that case you probably want to call coarser grained services. Note that in both cases you can reuse the original services in your backend, and you should make sure they contain the business logic, to prevent copying logic to multiple places.
Security considerations
When you decide to create a mobile app using native or hybrid technology, there are a number of things you need to take into account. These include:
• Network security: encryption;
• Mobile Device Management (MDM) versus Mobile Application Management (MAM);
• Service side management: authentication and authorization of your services in the back end and management of the services or API.

Let's take a look at mobile device management, mobile application management and service management:

MDM: secure device password; VPN between device and server; wipe the entire device; track the device; native and hybrid; dedicated devices.
MAM: secure container password; secure tunnel between app container and server; wipe the application; track the application; native and hybrid; BYOD.
Service management: secure service; transport level security using SSL (for example); protection against DOS; track service use; web app; BYOD.
Oracle offers Mobile Security Suite to take care of your security considerations for enterprise grade mobile apps by 'containerizing' them. On top of that you can secure and manage your services using API Gateway and the Oracle Identity and Access Management features. This is shown in the figure below.
Illustration 6. Oracle Mobile Security Suite (source: Extending Enterprise Access and Governance with Oracle Mobile Security)
Scalability
The introduction of mobile devices has led to a huge increase in load on services that used to be very stable. Examples of this are:
• Bank apps to check your balance. In the past people would get a bank statement periodically; the bank was in control of the information. Then Internet banking and ATMs entered the arena, allowing people to check their balance when getting cash or when behind the computer at home. With the introduction of mobile banking, customers can check their balance before they make any major or minor purchase. This has increased the load on the backend banking systems greatly.
• Travel apps to check flight times. A similar effect has happened with train and plane schedules. People always carry their mobile device and check times frequently when traveling. This has increased the load on the backend systems.
There are several solutions to handle the increase in load:
• Cache the response. Data can be cached on the device or on the service bus or API manager. This decreases the load if the same customer checks the same data repetitively.
• Load balance the service in the backend. The servers in the backend can be scaled out, so the additional load can be balanced over multiple servers.
• Smaller services (less data) for mobile use. Depending on the underlying backend architecture, creating smaller micro-services might decrease the load on the system because there is no excess of data being sent to devices.
Use cases
Now that you have seen some considerations you have to take into account when you decide to build a mobile app, let's take a look at some practical examples.

Field engineer fixing a utility meter

Illustration 7. Choosing the right mobile architecture: Field engineer fixing a meter

In this case, the main reason to choose a native app was the need to be able to work offline and cache data. Field engineers work in areas where there is sometimes no connection to the Internet. Because the device is a special purpose device, mobile device management was sufficient:
• If a field engineer leaves the job, or loses the device, the entire device is wiped clean.
• There are no other apps on the device than the ones installed on it by the utility company.
Judge reading court files

Illustration 8. Choosing the right mobile architecture: Judge reading case files

When people work from home, they are mostly working online. For judges, security demands are high and they'd like to read files on a tablet. The solution is a hybrid app that works similarly to the browser version for the desktop. The functionality resides on the server. The organization has chosen Mobile Device Management, because they hand out the devices. There is no BYOD policy in place. If there had been a BYOD policy, MAM would have been better. Specific presentation services are created for specific channels. There are special services for mobile (smart phones). For the tablet, the services for the browser are reused.

Engineer looking for a temp job

Illustration 9. Choosing the right mobile architecture: Engineer looking for a temp job

In another case, field engineers who travel the world are looking for their next job. They have time when at the airport, in a cab or on a train. Usually they are connected (by Wi-Fi) and security demands for the application are relatively low. Jobs are posted and people can post resumes on the site. In this case a web app is chosen, using OAuth. Presentation services were created to cater for the different form factors and to make sure that the app is fast enough.
Summary and conclusion
Everybody is using his or her device all the time nowadays. When building enterprise grade mobile apps, you need to take into account what type of architecture you are going to use: native, hybrid or web app. The biggest difference between these three architectures is where the logic or app resides: on the device or on the server. Depending on the device features you need access to and the connectivity demands, one of the three is the best fit. Because we are talking about enterprise grade apps, security is an important feature. When choosing native or hybrid apps, you need to think about whether you want to use Mobile Application Management or Mobile Device Management. In general Mobile Application Management is more fine-grained and more user friendly. It is very suitable for BYOD situations. The mobile security suite integrates with Oracle Identity and Access Management, so that your mobile app can truly be a part of your enterprise solution. Apart from Mobile Application or Device Management, you need to manage the APIs that your mobile app calls; often the APIs are used by multiple (even third party) apps and web applications and you don't want to rely on the diligence of the app programmer for the security of the services and APIs. When building mobile applications, the services in your organization are impacted too. There are a number of design choices you have to make:
• Stick with SOAP services or move to REST services;
• Build an extra layer of services to cater for specific mobile app needs (caching, smaller micro-services etc.) or solve this in the app itself;
• Scale out or cache to cater for the increase in load.
Using tools to help solve these issues is one part of the equation. However, making the right architectural decisions is the critical success factor that determines whether your company will be successful in taking advantage of the possibilities that arise from modern technologies.
Biju Thomas www.oneneck.com
Flashback Empowering Power Users!
Most of the time when a developer or application administrator comes to the database administrator with a table restore request, the standard answer is "No!" We cannot blame the DBA for the answer either, because the answer is based on the time it takes to restore a table from the full backup. If the database is a few terabytes big and there are not many tablespaces in the database, the DBA has to restore a subset of the database (the system and undo tablespaces and the tablespace holding the table) to a temporary location, export the table, and import it into the target database. It is not usual for the DBA to have the space sitting around to create the temporary database for the export. So the process gets even longer, starting with identifying the tablespaces to restore, requesting space, identifying the backups (full, incremental, archive log), performing the restore and recovery, and exporting the table. Let us learn what alternatives are available for power users themselves, without bothering the DBA… Sometimes we get lucky, especially if the damage was done in the last few hours or minutes.
Introduction to Flashback
Oracle introduced automatic undo management in version 9i of the database. Ever since, flashback operations have been available to retrieve the pre-modification state of the data in the database. Figure 1 shows the database version in which each flashback feature was introduced.
Figure 1: Flashback Features by Release
As you can see in Figure 1, there are many flashback features in Oracle Database. A few require DBA privileges, a few have configuration requirements before you can start using them, and a few are enabled by default and available for non-administrators. The flashback features that help to query and restore data as it existed at a prior timestamp on individual tables are available (or should be made available) to developers and application administrators. Having the knowledge and privileges helps developers and power users to act quickly to resolve the issue before the undo data disappears.
Let us review the flashback features for power users (and developers) and how to use them.

The Time Machine API!
The DBMS_FLASHBACK pl/sql package was introduced in Oracle9i Release 1 (yes, almost 15 years ago!!), when automatic undo management was introduced in the Oracle Database. This is one of the most powerful and easy to use packages to go back in time, and it is still available in Oracle Database 12c. The "back to the future" time point in the database can be enabled by using the DBMS_FLASHBACK.ENABLE_AT_TIME or DBMS_FLASHBACK.ENABLE_AT_SYSTEM_CHANGE_NUMBER procedure. Both procedures enable flashback for the entire session. Once enabled, the queries in the session will be as of the SCN or timestamp enabled. One big advantage of using this package is the ability to run pl/sql programs on data in the past. The following code shows setting flashback four hours back, and running queries to retrieve some data accidentally deleted from the EMPLOYEES table.

SQL> select * from hr.employees where employee_id = 106;

no rows selected

SQL> SELECT SYSTIMESTAMP FROM DUAL;

SYSTIMESTAMP
---------------------------------------------------
07-DEC-14 03.31.22.982933 PM -05:00

SQL> EXECUTE DBMS_FLASHBACK.ENABLE_AT_TIME(TO_TIMESTAMP('07-DEC-14 11:30','DD-MON-YY HH24:MI'));

PL/SQL procedure successfully completed.

SQL> select count(*) from hr.employees where employee_id = 106;

  COUNT(*)
----------
         1

SQL>
When a session is running in flashback mode, no DDL or DML is allowed; only queries can execute. The following code shows the error when I try to run DML or DDL while in flashback mode.

SQL> insert into hr.employees
  2  select * from hr.employees where employee_id = 106;
insert into hr.employees
*
ERROR at line 1:
ORA-08182: operation not supported while in Flashback mode

SQL> create table hr.emp_restore as select * from hr.employees where employee_id = 106;
create table hr.emp_restore as select * from hr.employees where employee_id = 106
*
ERROR at line 1:
ORA-08182: operation not supported while in Flashback mode
To end flashback mode, you can use the DISABLE procedure.

SQL> exec dbms_flashback.disable;

PL/SQL procedure successfully completed.

SQL> select * from hr.employees where employee_id = 106;

no rows selected

SQL>

So, how do you retrieve data using this method? The answer is PL/SQL. Let me show you with an example how to retrieve the deleted employee row using pl/sql. In the code, we open the cursor after enabling flashback, but disable flashback before we start fetching. COMMIT is not included in the pl/sql code, so after execution we confirm the row is inserted, and then commit the record.

DECLARE
  cursor frompast is SELECT * FROM hr.employees where employee_id = 106;
  pastrow hr.employees%rowtype;
BEGIN
  DBMS_FLASHBACK.ENABLE_AT_TIME(TO_TIMESTAMP('07-DEC-14 11:30','DD-MON-YY HH24:MI'));
  OPEN frompast;
  DBMS_FLASHBACK.DISABLE;
  LOOP
    FETCH frompast INTO pastrow;
    EXIT WHEN frompast%NOTFOUND;
    insert into hr.employees values pastrow;
  END LOOP;
END;
/

PL/SQL procedure successfully completed.

SQL> select count(*) from hr.employees where employee_id = 106;

  COUNT(*)
----------
         1

SQL> commit;

Commit complete.

SQL>
This is too much work to retrieve data. Enabling flashback for the session is good to find information and run various queries. To actually restore back the changes we should look into other flashback methods.
Flashback Query
Flashback query was introduced in Oracle9i Release 2. This is the best and fastest way to save or retrieve information as it existed in the past. Flashback query is enabled by using the "AS OF TIMESTAMP" or "AS OF SCN" clause in the SELECT statement.
SQL> select count(*) from hr.employees where employee_id = 106;

  COUNT(*)
----------
         0

SQL> select count(*) from hr.employees
  2  as of timestamp TO_TIMESTAMP('07-DEC-14 11:30','DD-MON-YY HH24:MI')
  3  where employee_id = 106;

  COUNT(*)
----------
         1

SQL>

The required row is still available, and getting it back is easy, because the session is not running in flashback mode; just the subquery is in flashback mode.

SQL> insert into hr.employees
  2  select * from hr.employees
  3  as of timestamp TO_TIMESTAMP('07-DEC-14 11:30','DD-MON-YY HH24:MI')
  4  where employee_id = 106;

1 row inserted.

SQL>

If you do not want to restore the row directly into the table, you can save the data in a table. In fact, it is a good practice to save the data in a table as soon as you know that some unintended data update or delete was performed.

SQL> create table hr.employees_restore pctfree 0 nologging as
  2  select * from hr.employees
  3  as of timestamp TO_TIMESTAMP('07-DEC-14 11:30','DD-MON-YY HH24:MI')
  4  where employee_id = 106;

Table created.

SQL>

And remember, if you find the data still available to retrieve and want to replace the entire table (by doing a truncate and insert) or want to disable a constraint, do not attempt to do that before saving the data to a temporary location. If you perform any DDL on the table, flashback queries will not work anymore on that table. Here is an experience. You can see here the DBA being proactive, trying to disable the primary key before doing the insert, and…

SQL> select count(*) from bt1;

  COUNT(*)
----------
        70

SQL> select count(*) from bt1 as of timestamp to_timestamp('07-DEC-14 15:15','DD-MON-YY HH24:MI');

  COUNT(*)
----------
        98

SQL> alter table bt1 disable primary key;

Table altered.

SQL> insert into bt1 select count(*) from bt1 a as of timestamp to_timestamp('07-DEC-14 15:15','DD-MON-YY HH24:MI') where a.cid not in (select b.cid from bt1 b);

ERROR at line 1:
ORA-01466: unable to read data - table definition has changed

SQL>
In such situations, always save the data in a temporary table, do the DDL on the table, and insert from the temporary table. The following operations on the table prevent flashback operations:
• moving or truncating
• adding or dropping a constraint
• adding a table to a cluster
• modifying or dropping a column
• adding, dropping, merging, splitting, coalescing, or truncating a partition or subpartition (except adding a range partition)

To use flashback operations, you require the FLASHBACK and SELECT (or READ) privileges on the table.
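For example (a sketch; SCOTT is just a hypothetical grantee, substitute your own power user), the DBA could grant these privileges on the HR.EMPLOYEES table used in this article:

GRANT FLASHBACK ON hr.employees TO scott;
GRANT SELECT    ON hr.employees TO scott;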
Flashback in Export
When there are many tables or if the table is big, it might be easier for you to save the data as an export dump file as soon as you hear about an unintended data manipulation. You can use the expdp parameters FLASHBACK_SCN or FLASHBACK_TIME to perform the export as of a specific SCN or timestamp. These parameters, along with the QUERY parameter, help you to further refine and filter the data exported. For example, to export the table to save the deleted employee 106, we can use the following parameters:

dumpfile=exprestore.dmp
logfile=emprestore.log
tables=hr.employees
flashback_time="to_timestamp('07-DEC-14 16:45','DD-MON-YY HH24:MI')"
query="where employee_id = 106"

Flashback Table
If you are very certain that the entire table must be reverted back to a state in the past, you can use the Flashback Table feature. For flashback table to work, row movement must be enabled. The following example shows how to roll back a table to a prior time.

SQL> FLASHBACK TABLE bt1 TO TIMESTAMP to_timestamp('07-DEC-14 16:45','DD-MON-YY HH24:MI');

ERROR at line 1:
ORA-08189: cannot flashback the table because row movement is not enabled

SQL> alter table bt1 enable row movement;

Table altered.

SQL> FLASHBACK TABLE bt1 TO TIMESTAMP to_timestamp('07-DEC-14 16:45','DD-MON-YY HH24:MI');

Flashback complete.

SQL>
Where is the Flashback Data Source
So, where does the flashback data come from? Remember, in the beginning I mentioned automatic undo management. Automatic undo management is a requirement for flashback operations. It is enabled by setting the parameters:
UNDO_MANAGEMENT=AUTO → since Oracle Database 11g, AUTO is the default setting.
UNDO_TABLESPACE=<tablespace_name> → specifies the undo tablespace for the instance. The default is the first available undo tablespace when the instance starts.
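To see how your instance is currently configured (a minimal check, assuming you have access to V$PARAMETER and DBA_TABLESPACES), you can run:

SELECT name, value
FROM   v$parameter
WHERE  name IN ('undo_management', 'undo_tablespace', 'undo_retention');

SELECT tablespace_name, retention
FROM   dba_tablespaces
WHERE  contents = 'UNDO';

The RETENTION column of DBA_TABLESPACES shows whether the retention guarantee discussed below is in effect (GUARANTEE or NOGUARANTEE).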
When any DML is run to insert, update or delete data in a table, an undo record is written to the rollback (undo) segment. This record is mainly used by the Oracle database to provide a read consistent view of the data. The data is also used to roll back the transaction and to run flashback queries.
The duration of undo data available to the database depends on the size of the undo tablespace. Once a transaction is committed, its undo data is no longer required for normal database operation, but it is used by long running queries and flashback operations. There is one other parameter that is relevant here: UNDO_RETENTION. This parameter provides guidance to the database on how long committed undo should be kept in the undo segments. Since flashback operations are a very useful feature of the database, the database tries to keep the undo records as long as possible, until there is not enough room in the undo tablespace, even past the value specified in seconds for UNDO_RETENTION (the default is 900 seconds). The retention behavior depends on the AUTOEXTENSIBLE property of the undo tablespace data files. If the undo tablespace is autoextensible, the database tries to honor the time specified in UNDO_RETENTION. If this duration is smaller than the longest running query in the database, undo retention is tuned to accommodate the longest running query. When free space is low, instead of overwriting unexpired undo (committed and within the retention period), the tablespace auto-extends.
If the undo tablespace is not autoextensible, the database ignores the UNDO_RETENTION parameter. Oracle tunes the undo retention to keep transactions as old as possible while keeping the undo tablespace usage below the warning alert threshold. When there is space pressure, unexpired undo is overwritten.
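As a small sketch of how this looks in practice: V$UNDOSTAT shows how long undo has actually been retained recently, and UNDO_RETENTION can be raised to request a longer window (the value 3600 below is only an example, pick one that matches how far back you need to flash back):

SQL> select max(tuned_undoretention) from v$undostat;

SQL> alter system set undo_retention = 3600;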
Oracle Undo Advisor provides the size required for the undo tablespace to keep a specified amount of undo retained. The Undo Advisor is available in the DBMS_ADVISOR PL/SQL package, but I prefer using Oracle Enterprise Manager, which is much easier. Figure 2 shows the Undo Advisor screen of OEM. We can tell OEM to analyze the workload of the database for the past so many days and specify a desired undo retention, and the advisor will give the required size of the tablespace.

Figure 2: Undo Advisor
It is best to configure the undo tablespace appropriately sized and with a fixed size. Note: if flashback operations and queries are important to you, you may ask the DBA to enable guaranteed retention. This is not a recommended configuration for production databases, as DML operations might fail if there is not enough free space, because Oracle will not overwrite unexpired undo until the time specified in UNDO_RETENTION is reached. Guaranteed undo retention is enabled for the undo tablespace by setting the RETENTION GUARANTEE clause using the ALTER TABLESPACE statement.

Other Flashback Features
There are other flashback features that depend on undo data. They will come in handy in some situations.
Flashback Versions Query: This feature was added in Oracle Database 10gR1. The VERSIONS BETWEEN {TIMESTAMP | SCN} … AND … clause in the SELECT statement provides all changes to a record within the timeframe. This is useful to audit changes.
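A minimal sketch of a Flashback Versions Query, reusing the hr.employees and employee 106 example from the export section (the time window and column list are illustrative):

SQL> select versions_xid, versions_operation,
  2         versions_starttime, versions_endtime,
  3         employee_id, salary
  4  from   hr.employees
  5         versions between timestamp
  6           to_timestamp('07-DEC-14 16:00','DD-MON-YY HH24:MI') and maxvalue
  7  where  employee_id = 106;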
71 – OTech Magazine #6 – winter 2014
Flashback Transaction Query: This feature was also added in Oracle Database 10gR1. It shows the changes made to data at the transaction level, without using LogMiner. You also get to see the SQL to undo the change. For this feature to work, the database must have supplemental logging enabled using "ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;".
Flashback Transaction: This feature was added in Oracle Database 11gR1. As the name suggests, it helps you to roll back a transaction and its child transactions. This is primarily a DBA feature, and it also requires setting up the database with supplemental log data on primary key columns. Using the OEM interface for transaction backout is much easier and less error prone.
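A sketch of a Flashback Transaction Query against the same example table (it assumes supplemental logging is enabled as described above; the predicates are illustrative, and UNDO_SQL shows the statement that would undo each change):

SQL> select xid, operation, table_name, undo_sql
  2  from   flashback_transaction_query
  3  where  table_owner = 'HR'
  4  and    table_name  = 'EMPLOYEES'
  5  and    commit_timestamp > systimestamp - interval '1' hour;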
Accidentally dropped a table? Don't worry! The table is available in the recycle bin and you can get it back from there… (provided your DBA has not changed the default behavior of the database by setting the parameter RECYCLEBIN=OFF). This operation does not use undo, so the dropped table will be available in the recycle bin until there is space pressure on the tablespace where the table used to reside.

SQL> DROP TABLE job_history;
Table dropped.
SQL> SHOW RECYCLEBIN
ORIGINAL NAME    RECYCLEBIN NAME                  OBJECT TYPE  DROP TIME
---------------- -------------------------------- ------------ -------------------
JOB_HISTORY      BIN$ZcGwtQ/sKCbgQAB/AQBl2g==$0   TABLE        2014-12-07:23:27:54
SQL> FLASHBACK TABLE job_history TO BEFORE DROP;
Flashback complete.
Conclusion
As a developer or power user, Oracle gives you the opportunity to recover from unintended mistakes in data changes. The requirement is automatic undo management, which is the default since Oracle 11g. It is also important to size the undo tablespace appropriately to retain undo data for a certain period of time. Since Oracle 10g, when a table is dropped it is kept in the recycle bin and can be restored; the recycle bin is also enabled by default since 10g. Discuss with your DBA what you can do and what the undo retention settings are. If you are a DBA reading this article, it is important for you to show the developer or power user what they can do to query and save data from the past… it might save you an unnecessary restore of the database or table!
Lucas Jellema www.amis.nl
The Rapid Reaction Force – real time business monitoring
Oracle offers a product that is almost like a hidden gem. Not because it is hidden – because if it is, it is hidden in plain view – but because we have collectively failed to grasp the true potential of this product. Oracle Event Processor (OEP) is typically associated with real time processing of low level events aka Fast Data. Typical examples of use cases involve stock tick processing, Tweet analysis and Internet of Things signal evaluation. These are valid use cases – but perhaps a little exotic. There are many examples closer to home where OEP can deliver excellent value. This article will introduce the product OEP, the concepts and the underlying technology. It will introduce the new StreamExplorer tool that was announced at Oracle OpenWorld 2014 and is soon to be released. The integration of OEP with companion components in SOA Suite 12c and the WebLogic platform is described. A simple example of using OEP for interpreting a real time stream of messages is discussed. And the article will discuss how OEP can be useful in very ordinary circumstances. Note that you may very well already have the right to make use of OEP: OEP is included in the SOA Suite 12c (and 11g). Any organization with a license for SOA Suite can therefore make use of OEP without additional investment. Additionally, for use in IoT cases, Oracle offers an embedded edition of OEP (Oracle Event Processing for Oracle Java Embedded) for a very low per-device fee, with the assumption that this edition will run on a large number of small devices. OEP can also be purchased as a stand-alone product.
Introducing Oracle Event Processor
Before getting into technical details, we should first take a look at what events and event processing entail and what OEP can do in fairly conceptual terms. An event is a somewhat vague term used to indicate a fairly small piece of data, associated with a timestamp, that is published in a fire-and-forget style. The publisher of the event volunteers the data and wants nothing in return. In fact, the publisher does not even want to know who will consume the data. In general, the data published is not apparently valuable and may only have some value by association with other – previous or future – events. An event can represent the reading of a sensor, a tweet or the latest value of a stock, but it can also represent a step in a business process, a user action on a web site, an update of a database record or the execution of a business service. Note that the absence of an event can also be [made] meaningful: the fact that something that was expected did not materialize [in time] may represent value.
Key characteristics
OEP is a lightweight component with a fairly small footprint (one of the reasons it can be used in the embedded use case, for example on a device the size of a Raspberry Pi). It holds relevant data in memory (or on a grid); most of the processing does not require I/O operations, allowing OEP to respond very fast.
OEP is not weighed down by transactions and synchronous interactions. It operates in a very decoupled fashion – fire and forget – both on the receiving as well as on the publishing end. The product uses smart algorithms for enrichment, filtering, partitioning, aggregation and pattern detection. Note that most operations will take place across collections of events, very much like SQL operations operate on sets of relational database records. OEP is triggered into action either by the reception of an event or by internal timers. The latter are used to monitor deadlines before which an event should have been received or to drive the periodic publication of aggregates and current status updates.
Interactions Messages are typically read from JMS—for example, a queue in WebLogic – or the SOA Suite Event Delivery Network (EDN). Tables can also be used as an event source, and through custom adapters, we can consume messages from virtually any source, including files, sockets, NoSQL database and JMX, Oracle Messaging Cloud Service, Web Sockets, RSS feeds and other HTTP streams. The OEP server provides an HTTP Pub/Sub event channel based on the Bayeux protocol; this allows OEP to consume messages pushed from a Bayeux-compliant server.
The outcomes from event processing are published to similar, decoupled destinations: JMS, EDN, tables, HTTP Pub/Sub Channels and Web Socket channels. A common companion for OEP is BAM (Business Activity Monitoring) for final processing and dashboard style visualization of event based findings, notifications and recommendations.
Stages and operations in Event processing
The objective of an OEP application is to get from fine grained data events to relevant information packages. This is typically done in several steps:
• Filter - Finding the needles in the haystack, those rare events that indicate exceptions
• Enrich - Adding context and thereby (business) meaning to events, such as associating an event with GPS coordinates to a shop, bus stop, restaurant, service vehicle or another meaningful location
• Aggregate - Counting events or determining a sum, an upper or lower limit, or an average over a metric that the events carry in their payload. Note: an aggregate is frequently calculated over a window that consists of all events in a specific timeframe or a specified number of the latest events.
• Pattern detection - Spotting a trend by comparing correlated events over a period of time, finding a special sequence of values or value changes that is known to herald a certain incident, and discovering the absence of an event that should have occurred are all examples of the detection of meaningful patterns among a string of correlated events.
• Promote or propagate - The result of the preceding operations can be to propagate the event, either as is or in an enriched or promoted state, or to publish entirely new events that carry an aggregated value, some indicator, or an alert.
It is quite common that the output from an OEP application is fed into another OEP application to perform the next iteration of refinement, filtering and upgrading to an even coarser grained event with even more business value. An example of this is the combination of an embedded OEP that processes high frequency local sensor signals into less frequent aggregate values covering larger areas that are sent to centralized OEP processors for pattern analysis and aggregation across locations.
Usage Scenario
The core strengths of OEP seem to be:
• aggregation of low level signals
• cross instance (or conversation) monitoring
• real time interpretation
all in a fast, lean, continuous and decoupled fashion that can easily be integrated with existing applications and engines. In addition to all the obvious fast data use cases – a high volume of messages that requires real time processing, such as social media, IoT, network packets and stock tick feeds – there are use cases within closer reach for most of us. Consider OEP the monitor that can look and analyze across threads and sessions, at transactional or non-transactional actions, at the middle tier (JMS, HTTP) and the database (HTTP push/pull or JDBC/SQL pull), in a decoupled way that hardly impacts the operations you would like to keep track of.
• The execution of business processes is orchestrated by engines such as BPEL and BPM with contributions from various systems and components; the start and completion of each step in a process can be reported in an event. OEP can easily be used to spot process instances that are interrupted or that skip over a step (missing event). When there is an order dependency between process instances or a limit on the number of parallel instances of a certain process, OEP can be used to monitor these. A decrease in performance for a certain resource and the build-up of a waiting queue can be detected as well.
• The navigation and actions of users through a web application can be reported; through OEP, we can easily get insight into the number of current sessions and the activities in each of them. We can learn about search terms that result in desired follow-up steps and outcomes and frequently used terms that do not end in success. We can learn about the step at which users typically abort their actions (by detecting the missing event) as well as the steps that take long or short.
• When you consider certain combinations of activities suspicious, OEP is a perfect tool to detect instances of suspicious behavior and allow immediate intervention; all it takes is to publish an event for each activity that can be part of a chain of actions that indicates something fishy could be going on and – the hardest part – describe the patterns to look out for. Examples are multiple concurrent logins from the same IP address, activities at times or from locations that are quite different from a user's normal behavior, or logins from the same user from entirely different geographical locations within a short period of time.
• OEP is straightforward to set up for detecting sudden peaks or gradual increases – for example in web traffic, system usage, response times or error rates – and it can easily partition by specific characteristics.
Any type of behavior that can be represented by simple messages can be presented to OEP for processing. Large volumes and high urgency may be involved, but that is not necessarily the case. Looking for patterns, keeping track, maintaining current status and doing so across systems and instances is the core of OEP's offering. And for this, there are bound to be many more use cases.
CQL – The Continuous Query Language
The core of OEP is formed by the CQL engine. CQL is an acronym for Continuous Query Language, one of the most prominent Event Processing Languages. Event processors need to be programmed. They are like query engines that need to query streams of events—not static sets of records, but incessantly changing collections of data granules. Event queries are not executed just once. They are instead attached to an event stream and continuously executed. Every new arriving event may cause the event query to produce a new result. Results are constantly being published in the form of events.
CQL was inspired by, derived from and overlaps with SQL. The widespread knowledge of SQL can be leveraged when programming event processing queries. CQL allows us to combine event streams and relational data sources in a single query (for example, to join historical and reference data with the live event feed). As an aside: the Match_Recognize operator introduced in Oracle Database 12c to perform pattern matching on relational data sets makes use of the CQL syntax to construct the pattern query. CQL queries select from an event channel, often with some range or time window applied, using a where clause to filter on the events that are returned by the query. The select clause refers to event properties and uses functions and operators to derive results. Note that multiple channels and other data sources can be joined – yes: even outer joined – together in a single CQL query. Here is a simple example of a CQL query that produces an output event whenever the number of input events in the someEventStream with a payload value larger than 15 changes, considering a time window that ranges over the last five seconds:

select count(payloadValue) as countEvents
from   someEventStream [range 5]
where  payloadValue > 15

The number of results produced by this query cannot be predicted: it depends on the frequency and irregularity with which events appear in the event stream.
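To make the family resemblance with SQL concrete, here is a sketch of the Match_Recognize pattern matching mentioned in the aside above, in its Oracle Database 12c SQL form. It is the classic "V-shape" stock example from the Oracle documentation, run against a hypothetical ticker(symbol, tstamp, price) table: per symbol, find a falling run of prices followed by a rising run.

select *
from   ticker match_recognize (
         partition by symbol
         order by tstamp
         measures strt.tstamp       as start_tstamp
                , last(down.tstamp) as bottom_tstamp
                , last(up.tstamp)   as end_tstamp
         one row per match
         after match skip to last up
         pattern (strt down+ up+)
         define down as down.price < prev(down.price)
              , up   as up.price   > prev(up.price)
       ) mr;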
Technology Overview The Oracle Event Processor is part of the SOA Suite license. However, it is not part of the SOA Suite SCA container—it does not even run in the same WebLogic Server as the SOA Suite does. It runs on its own streamlined, lightweight server—which is POJO based, founded on Spring DM and an OSGi-based framework to manage services. This server comes with Jetty, an HTTP container for running servlets, and support for JMS and JDBC. It has caching and clustering facilities—optionally backed by an Oracle Coherence grid. OEP can handle thousands of concurrent queries and process hundreds of thousands of events per second. The average message latency can be under 1 ms.
The OEP server provides an HTTP Pub/Sub event channel based on the Bayeux protocol; this allows the OEP application to consume messages pushed from a Bayeux-compliant server as well as allows clients to subscribe to messages published by the OEP server. Clients can also publish to an HTTP channel without having to be explicitly connected to the HTTP Pub/Sub server. Event messages are typically processed by CQL processors—nodes in the event processing network (EPN) that receive messages from stream sources and act on them to produce new event messages. The outcome of a CQL query is typically a stream of events. OEP provides additional facilities, such as the ability to record events flowing through an event processing network and store them. They can later be played back from the event repository – very useful for testing. The OEP Visualizer is a web based administration console that enables you to view, develop, configure, and monitor aspects of Oracle Event Processing applications. OEP applications can be deployed, paused and resumed, and uninstalled from the Visualizer. Current operations can be monitored, including all console output. It also has support for visual and run-time CQL Query editing. Events published on an HTTP Pub/Sub channel can be monitored in the Visualizer, asynchronously pushed by the server to the browser.
Airport Car Park Management using Event Processor
Let us take a look at a concrete example. We are borrowing here from the case of Saibot Airport in Lexville that is used throughout the Oracle SOA Suite 12c Handbook (Oracle Press). This airport makes a lot of money from the car parks around the airport. Close monitoring and careful management of the parking process is important for that reason and because it is an important aspect in the evaluation by travelers in airport surveys. Indicating at the right time that a car park is full prevents people from needlessly driving in circles to find a free spot. However, warnings that come too early mean that people are unnecessarily forced to park their cars in the more remote car parks – not necessarily improving their experience. We will create an OEP application to process all events signaling a car entering or leaving a car park and thus derive the number of cars currently inside the car park. The application will publish a special event when the car park is declared full, and it will publish a similar event when spaces become available again. OEP 12c applications are developed in JDeveloper (unlike OEP 11g, which used its own IDE plugin for Eclipse). These applications are deployed to and run on an instance of the OEP server. This server is not part of the integrated WebLogic Server and it is separate from the SOA Suite 12c run time environment based on WebLogic. The OEP runtime is an OSGi server that needs to be installed separately.
Getting Started
A new JDeveloper application is created based on the template OEP Application, called SaibotParking. An OEP project CarParking is created inside the application – based on the Empty OEP Project template. The application is created with an empty Spring context file called CarParking.context.xml – which will contain definitions of beans, events, channels and processors – and a file called processor.xml that is used to store the CQL queries. The EPN Diagram visualizes the Event Processing Network that is defined in the context file. The MANIFEST.MF file finally contains OSGi definitions – for the application itself and all its bundle dependencies.
Events in OEP can be defined entirely declaratively or based on a Java bean (a POJO). We will adopt the latter approach, using a Java Class CarMovementEvent, which is a POJO with the properties carparkIdentifier, entryOrExit, licencePlate and arrivalTime. Based on this POJO, an OEP event type is defined. This represents the input to the application: cars arriving at or departing from one of the car parks. A second POJO represents the results from OEP: conclusions about the status of the car parks. This POJO has properties carparkIdentifier, carCount, percentageFull and averageStayDuration.
The application contains a Java Class that generates the CarMovement events. This class generates car arrival events at a somewhat random rate, with randomly generated licence plates. Later, more advanced incarnations of this class will derive car exit events as well and will generate events covering five different car parks. This generator is used during development. When the event processing has been developed and verified, the generator can be replaced by an inbound adapter that reads the events from JMS, an HTTP channel or some other source. The output from the OEP application can be reported through outbound adapters to external receivers, such as a JMS destination, an RMI client, a Socket endpoint or the Event Delivery Network. During development and for testing purposes, it is frequently convenient to work with an internal event receiver. A simple Java Class that implements the StreamSink interface is used to receive the outcomes from the event processor and write them to the console. Java Class CarParkReporter implements that interface and writes simple logging lines based on the events it receives.
The Event Processing Network The Event Processing Network or EPN ties together the event sources – in this case the event generator -, the CQL event processors and the StreamSinks that handle the outbound operations. In between the components are channels that carry events of a specific type.
The flow from the input channel directly to the stream sink carParkReporter means that each and every car movement is channeled to the Java Class, without any processing at all. The real intelligence of this network is implemented in the carMovementProcessor that processes car movement messages using the following CQL statement:

IStream(select sum(car.entryOrExit) as carCount
             , car.carparkIdentifier
        from   carMovementInputChannel as car
        group by carparkIdentifier)

This query produces an update on the current number of cars per car park. CQL will publish a result in this case whenever the previous result is superseded by a new one – i.e. with every new car that enters the parking lot and every car that leaves. Note: the value of property entryOrExit is -1 for an exit and +1 for an entry of a car.

Such an update of the car count is not really required for every car that enters. We may want to settle for a summary update every five seconds. This is easily accomplished by rewriting the from clause as:

from carMovementInputChannel [range 1 DAY slide 5 second] as car

Here we have specified that the results from the query are to be calculated over one day worth of [input] events and should be produced once every five seconds.

Correlation and Missing Event Detection
OEP can do much more than just count and aggregate. It can for example correlate events, detect 'missing' events and derive business intelligence from them. Chaining OEP processors – within or across applications – is commonly done to derive varying conclusions from a certain base set of data signals at different levels of granularity. We will use several processors inside our application.
In this section, we determine for cars that leave how long they have stayed in the car park; we could use that to automatically calculate the parking fees. We next extend our summary with the average stay duration of cars. We look for cars that have out-stayed their welcome: our car parks are short stay only, intended for parking up to 36 hours. Cars that have been in the car park for more than 48 hours become candidates for towing.

Correlate events
One thing OEP excels at is correlating events. It can find patterns across events that arrive at various points in time. In this example, we will have OEP associate the arrival and exit events for the same car. These events together constitute the 'car stay' for which we could calculate the parking fee. By taking all these car stay events into consideration, we can have OEP calculate the average stay duration – per car park. The key CQL operator we leverage to correlate events into meaningful patterns is called MATCH_RECOGNIZE. The query used in the carStayDoneProcessor uses the following clause:

...
from carInputChannel match_recognize (
  partition by carparkIdentifier, licencePlate
  measures A.licencePlate      as licencePlate
         , A.carparkIdentifier as carparkIdentifier
         , A.arrivalTime       as arrivalTime
         , B.element_time - A.element_time as stayduration
  pattern (A B)
  define A as entryOrExit = 1
       , B as entryOrExit = -1
) as stayed
The crucial element in a MATCH_RECOGNIZE expression is the pattern that should be detected in the event stream. In this case, the pattern consists of events A and B that both apply to the same car park and car (through the licence plate), as defined by the partition by clause. Additionally, for this one car, A is the arrival event (entryOrExit equals 1) and B the subsequent exit event. Whenever these two events have been found – with no other events in between with different values for the entryOrExit attribute – the query will produce an event of type CarStayDone. The stayduration measure is calculated using the pseudo column element_time. Every event is timestamped by OEP in such a way that the difference between the element_time value for two events is equal to the time lapse (in nanoseconds) between the two events.
Using the CarStayDone events published to the carstayDoneChannel, it becomes quite simple to derive the CarParkEvents that report the average stay duration for each car park – with an update published whenever a car leaves a car park. The CQL snippet below takes care of this. Note that the average duration in this case is calculated over the last 100 cars that have left the car park.

<query id="CarparkAvgStayQuery"><![CDATA[
  IStream(select -1 as carCount
               , carparkIdentifier as carparkIdentifier
               , '' as description
               , avg(carstayDuration) as averageStayDuration
          from carstayDoneChannel [rows 100]
          group by carparkIdentifier)
]]></query>
The EPN is a little bit more complex at this point.
Detect missing events and find abandoned cars
As discussed before, frequently the absence of an expected event is what we should detect and what should result in a new business event being published. In our case of airport parking, the missing event is the car that does not leave our short stay car park in time. The event generator has been extended to also come up with cars that will not depart any time soon. Our OEP application should be able to detect such cars – cars that do not leave within 120 seconds (on the time scale of our sample application). See the sources in project CarParking_step3 for this step. A new event was defined – AbandonedCarEvent – based on a POJO with that name. Java Class AbandonedCarsMonitor, implementing the StreamSink interface, was created to receive these events and write a log message for them. The abandonedCarsProcessor was added to process the events from the carInputChannel, detect the abandoned cars and report on them to the abandonedCarsOutputChannel, as shown in the next figure. The next CQL snippet is at the heart of the processor.

from carInputChannel match_recognize (
  partition by carparkIdentifier, licencePlate
  measures A.licencePlate      as licencePlate
         , A.carparkIdentifier as carparkIdentifier
         , A.arrivalTime       as arrivalTime
  INCLUDE TIMER EVENTS
  pattern (A NotExit*) duration 120 seconds
  define A       as entryOrExit  = 1
       , NotExit as entryOrExit != -1
) as abandoned
Again, we inspect the CarMovementEvents on the carInputChannel. We are looking for the pattern A – entry of a certain car – followed by the NotExit event, which has been defined as an event for which the entryOrExit value is not equal to -1. A car that has entered the car park can do nothing but leave that car park again at some point, generating an event with entryOrExit = -1 at that point. And if it does, it is not abandoned. OEP allows us to mix in system generated timer events with the regular event stream – using the INCLUDE TIMER EVENTS instruction, combined with the clause DURATION 120 SECONDS. These tell OEP that in every partition of the event stream – for every licence plate and car park combination for which we have seen a first CarMovementEvent – a system event should be added every two minutes if there has not been a real event in that two minute timeframe. Subsequently, when such a timer event is added for a certain car – because for 120 seconds there has not been a real event, which should have been the 'car exit' event – the pattern will match for that car, because the system generated timer event does not have entryOrExit equal to -1. In fact, timer events do not have attribute values at all.
Credit Card Theft Detection
Another interesting use case for OEP concerns the shopping area at Saibot Airport, where a series of credit card thefts have taken place. Upon investigation, it became clear that on each occasion the perpetrator stole the credit card in the crowded main terminal area, made his way towards the exit and in between, within 15 minutes, made three or more purchases with the credit card in the range of $200-$500. He sometimes backtracked a little, entering a shop a little further from the main exit than the previous one. OEP is set up to process a stream of credit card purchase events that contain the amount, the credit card number and the shop identifier. With OEP, the pattern described above and shown in the next figure is to be detected. When it is found, it is to be reported in the form of an EDN event to be taken care of by a SOA composite.
The salient aspects of this CreditCard processing application:
• Use of the listAgg operator to aggregate multiple events into a Java Collection
• Use of custom Java Classes from within CQL queries
• Use of XMLParse to add XML content inside an XML Element with child elements, based on a string [prepared in a custom Java class from a List of shop identifiers and purchase amounts], to produce richer, more complex EDN event payloads
• Use of the keywords within and subset in the CQL query to respectively find a pattern only if it occurs within a certain period of time (15 minutes) and to allow aggregation over all or some of the events that constitute the pattern.
This next snippet shows some crucial aspects of the CQL query that finds fishy transactions:

select its.creditCardNumber as creditCardNumber, totalSum as sumAmounts
     , delta as shoppingDuration, shops as shops, purchaseAmounts as purchaseAmounts
from   creditCardMidRangePurchaseChannel MATCH_RECOGNIZE (
         PARTITION BY creditCardNumber
         MEASURES C.creditCardNumber                   as creditCardNumber
                , sum(allPurchases.amount)             as totalSum
                , C.element_time - A.element_time      as delta
                , listAgg(allPurchases.shopIdentifier) as shops
                , listAgg(allPurchases.amount)         as purchaseAmounts
         PATTERN ( A+ B? A+ C ) within 30000 milliseconds
         SUBSET allPurchases = (A,B,C)
         DEFINE A as A.shopIdentifier >= A.shopIdentifier
              , B as B.shopIdentifier < A.shopIdentifier and A.shopIdentifier - B.shopIdentifier < 7
              , C as A.shopIdentifier - C.shopIdentifier < 7
       ) as its
The messages produced by the OEP application are published to a JMS queue. An ADF web application was developed to demonstrate how these findings could be visualized to notify human actors. The next figure shows how the ADF DVT component custom thematic map was used to visualize the locations and relative sizes of all transactions in a suspicious train of events.
StreamExplorer
Stream Explorer is a tool – first showcased at Oracle OpenWorld 2014 – targeted at the Line of Business user (the non-technical IT consumer). This tool provides a visual, declarative, browser based wrapper around Oracle Event Processor and Oracle BAM. With Stream Explorer (assumed to become available close to the turn of the year) it is very easy to create explorations and dashboards on live [streams of] data – reporting in real time on patterns, correlations, aggregations and deviations. As Oracle puts it:
“The Oracle Stream Explorer empowers Business Users, anywhere, for RealTime, Instant Insight of Fast Data Streaming Data Analysis. It is designed for SaaS on the Oracle Cloud. Requiring NO knowledge of Real Time Event Driven Architectures, the Standards-based Continuous Query Language (CQL), the high performance Java Event Server or the semantics of Event Stream Processing Application Models.” The following figure shows a screenshot of the Stream Explorer user interface. The business user can make use of many predefined templates that are categorized by industry. These templates represent specific ways of processing events.
The business user defines Streams for the input. Such a Stream can be a (CSV) file, an HTTP Subscriber, an EDN event, a JMS destination, a Database Table or a REST service. These streams carry shape instances. A shape is a record (event) definition that consists of the properties and their types. The Streams can perhaps be preconfigured for the LoB user by the tech savvy IT colleague. The business user can then define Explorations on the Streams, which are based on predefined templates – as shown in the next figure.
The business user has to provide values for template parameters such as the aggregation interval, the aggregation function or the event property to group by. Multiple explorations can be chained; this means that the outcome of one exploration can provide the input for the next. Downstream destinations to which the outcome of an exploration can be sent are a CSV file, an HTTP Publisher, EDN, a JMS destination, a REST service or BAM – the Business Activity Monitor.
Summary
StreamExplorer is a web application that is installed in the Jetty Server that also runs the OEP engine and the OEP Visualizer Admin Console. OEP applications configured using StreamExplorer can be exported, to be imported and refined | extended | augmented in JDeveloper by the IT specialists. The StreamExplorer applications contain an Event Processing Network and CQL based event processors that were constructed using templates.
Event processing applications use CQL – the continuous query language that is similar in syntax to SQL and that can join event channels, caches and relational data sources to produce coarser grained, higher value output events. OEP connects on both inbound and outbound edges with SOA Suite and Java applications through JMS, Coherence and the Event Delivery Network. By focusing on the core competencies of OEP – light weight, multi-channel interaction, cross instances|sessions|transactions| conversations – it is easy to see how for example OEP can play an important role for monitoring of and across business processes. Note that the core of the OEP engine has been embedded inside Oracle BAM to help with precisely that.
The near future of StreamExplorer holds interaction with NoSQL sources and destinations, BAM visualizations, support for machine learning algorithms, data redaction on sensitive data, Coherence cache integration and location based features. This will make StreamExplorer – and OEP – more powerful and more accessible for use in every-day situations instead of only in the corner cases Complex Event Processing was typically considered for.
Real time findings – insight, alert, action - based on vast and possibly fast data from a variety of sources – published to various types of channels for onwards processing is in a nutshell what Oracle Event Processor can do for us. By constantly scanning streams of events arriving at unpredictable moments, OEP is able to look out for abnormal conditions, meaningful patterns, relevant aggregates and even important event absences.
Compared to the 11g stack, development for OEP 12c is much easier thanks to the integration in JDeveloper. The support for the Event Delivery Network in the SOA Suite means that it is easier still to implement EDA (Event Driven Architecture) with SOA Suite or at least make the service architecture more event-enabled. Finally, the new StreamExplorer
product will most likely further lower the threshold for starting to leverage event processing.
Resources
The source code for the Saibot Airport Car Park Management example discussed in this article can be downloaded from GitHub. Also on GitHub are the sources for the CreditCard Theft case. A presentation with a quick overview of event processing and how OEP can be used (presented at Oracle OpenWorld 2014) is available here. There is also an example of using the Oracle Database 12c Match_Recognize clause to do CQL-like pattern matching on relational data.
otech partner:
For over 22 years, AMIS has been the leading Oracle technology partner in the Netherlands. AMIS consultants are involved in major Oracle implementations in the Netherlands and in various ground breaking projects worldwide. Based on a solid business case, we bring the latest technology into actual production for customers. AMIS's reputation for technological excellence is illustrated by the AMIS technology blog. This blog serves over 5,000 visitors daily and is in the top 3 of the world's most visited Oracle technology blogs. There are 3 Oracle ACEs in the AMIS team, including 1 ACE Director, who make regular contributions to the worldwide Oracle community. AMIS is an Oracle Platinum Partner and won the EMEA Oracle Middleware partner of the year award in 2014, and in Holland in 2013 and 2011. AMIS delivers expertise worldwide. Our experts are often asked to:
- Advise on fundamental architectural decisions
- Advise on license-upgrade paths
- Share our knowledge with your Oracle team
- Give you a headstart when you start deploying Oracle
- Optimize Oracle infrastructures for performance
- Migrate mission-critical Oracle databases to cloud based infrastructures
- Bring crashed Oracle production systems back on-line
- Deliver a masterclass
Click here for our review on OOW 2014
Patrick Barel
Lucas Jellema
www.amis.nl
www.amis.nl
Amis Edisonbaan 15 3439 MN Nieuwegein +31 (0) 30 601 6000 info@amis.nl www.amis.nl www.twitter.com/AMIS_Services https://www.facebook.com/AMIS.Services?ref=hl specializations: Oracle ADF 11g, Oracle Application Grid 11g Oracle BPM 11g, Oracle Database 11g, Oracle Enterprise Manager 12c, Oracle SOA Suite 11g, Oracle RAC 11g, Oracle Weblogic Server 12c, Exalogic
Mahir M Quluzade www.mahir-quluzade.com
Oracle Database In-Memory (Part I)
Database In-Memory is a new and powerful feature of Oracle Database, introduced with Oracle Database 12c Release 1 (12.1.0.2), which was released in July 2014. This article series covers this amazing feature of Oracle Database.
Introduction
Usually we run analytics on Data Warehouses (DWH), which store data from Online Transaction Processing (OLTP) systems. Analytics consists of complex queries running on very large DWH tables. But a DWH does not run in real time like OLTP. Oracle Database In-Memory (IM) can help run analytics and reports in real time on OLTP databases. Oracle Database In-Memory also supports both DWH and mixed workload OLTP databases.
Figure 1: Data Block and Row Piece format
Database Block
Oracle Database stores data in row format, in other words as multi-column records in data blocks on disk (Figure 1). In a row format database, each new transaction or record is stored as a new row in the table. A row format is the ideal format for OLTP databases, as it allows quick access to all of the columns in a record, since all of the data for a given record is kept together in memory and on disk; it is also incredibly efficient for processing DML.
This has not changed in the latest release of Oracle Database; the changes are only in the memory architecture, which means only in the instance.
Instance Changes
An Oracle instance contains memory and a set of background processes. The memory is divided into two different areas: the System Global Area (SGA) and the Program Global Area (PGA). Oracle creates server processes to handle the requests of user processes connected to the instance. One of the important tasks of server processes is to read data blocks of objects from data files into the database buffer cache. By default, Oracle stores data in the database buffer cache in row format. Starting with Oracle Database 12c Release 1 (12.1.0.2), a new optional static pool called the In-Memory area is added to the SGA, and objects are stored in this area in a new format: the In-Memory Column Store (IM column store). The IM column store is an optional area that stores copies of tables, partitions, columns of tables and materialized views (objects specified as INMEMORY using DDL) in a special columnar format optimized for rapid scans. The database uses special techniques, including SIMD vector processing, to scan columnar data rapidly. The IM column store is a supplement to, rather than a replacement for, the database buffer cache. The IM column store does not replace the buffer cache, but both memory areas can store the same data in different formats (Figure 2), and it is not required for objects populated into the IM column store to be loaded into the database buffer cache; in other words, objects can be populated only into the IM column store.
Figure 2: Dual format store in memory
Enable In-Memory Column Store
Starting in Oracle Database 12c Release 1 (12.1.0.2), six new initialization parameters with the INMEMORY_ prefix control the In-Memory functionality (Code Listing 1). Additionally, the new OPTIMIZER_INMEMORY_AWARE initialization parameter enables or disables all of the optimizer cost model enhancements for In-Memory. The default value is TRUE. Setting the parameter to FALSE causes the optimizer to ignore the in-memory property of tables during the optimization of SQL statements.

Code Listing 1: Initialization Parameters for In-Memory
SQL> show parameter inmemory
NAME                                     TYPE        VALUE
---------------------------------------- ----------- -------
inmemory_clause_default                  string
inmemory_force                           string      DEFAULT
inmemory_max_populate_servers            integer     1
inmemory_query                           string      ENABLE
inmemory_size                            big integer 0
inmemory_trickle_repopulate_servers_     integer     1
percent
optimizer_inmemory_aware                 boolean     TRUE

INMEMORY_SIZE sets the size of the In-Memory Column Store (IM column store) on a database instance. The default value is 0, which means that the IM column store is not used, so the In-Memory feature is not automatically enabled. We need to change the INMEMORY_SIZE initialization parameter to a non-zero amount to enable the IM column store. If you do not want this amount taken from the current SGA size, then you must extend the SGA size to account for the INMEMORY_SIZE parameter value. So if you are using Automatic Memory Management (AMM), you will need to extend the MEMORY_TARGET parameter value, or if you are using Automatic Shared Memory Management (ASMM), you will need to extend the SGA_TARGET parameter value. In my case I am using AMM with a memory maximum target of 1024MB, and I set the INMEMORY_SIZE parameter to 300MB (Code Listing 2).
Code Listing 2: Enable In-Memory functionality
SQL> alter system set memory_max_target=1324M scope=spfile;
System altered.
SQL> alter system set inmemory_size=300M scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 1392508928 bytes
Fixed Size                  2924304 bytes
Variable Size            1040187632 bytes
Database Buffers           16777216 bytes
Redo Buffers               13852672 bytes
In-Memory Area            318767104 bytes
Database mounted.
Database opened.
SQL> alter system set memory_target=1324M;
System altered.
SQL> select * from v$sga;
NAME                 VALUE      CON_ID
-------------------- ---------- ----------
Fixed Size              2924304          0
Variable Size         939524336          0
Database Buffers      117440512          0
Redo Buffers           13852672          0
In-Memory Area        318767104          0

In-Memory Population
Only objects with the INMEMORY attribute specified in their DDL are populated into the IM column store.
Figure 3: INMEMORY clause
You can apply the INMEMORY clause to specific groups of columns, whole tables, materialized views or table partitions (Code Listing 3).

Code Listing 3: IM Column Store for tables and materialized views
SQL> conn mahir/mahir
Connected.
SQL> create table TBIM1 (id number, d date) INMEMORY;
Table created.
SQL> create table TB1 as select level as id from dual connect by level<=1e6;
Table created.
SQL> create materialized view MVIM_TB1 INMEMORY as
  2  select * from tb1;
Materialized view created.
SQL> select table_name, inmemory from user_tables;
TABLE_NAME         INMEMORY
------------------ ----------------
TBIM1              ENABLED
TB1                DISABLED
MVIM_TB1           ENABLED

You can also enable the IM column store at the tablespace level, so that all tables and materialized views created in the tablespace are automatically enabled for the IM column store. In addition, you can enable the IM column store for existing objects and tablespaces after creation with an ALTER … DDL statement (Code Listing 4).

Code Listing 4: IM Column Store for tablespaces
SQL> create tablespace TBSIM1
  2  datafile '/u01/app/oracle/oradata/prmdb/tbsim1.dbf' size 10M default inmemory;
Tablespace created.
SQL> select tablespace_name, def_inmemory from user_tablespaces;
TABLESPACE_NAME                DEF_INME
------------------------------ --------
SYSTEM                         DISABLED
SYSAUX                         DISABLED
UNDOTBS1                       DISABLED
TEMP                           DISABLED
USERS                          DISABLED
CMDATA                         DISABLED
TBSIM1                         ENABLED
SQL> create table TBIM2 as select * from dba_objects;
Table created.
SQL> alter table TBIM2 INMEMORY;
Table altered.
SQL> create tablespace TBSIM2
  2  datafile '/u01/app/oracle/oradata/prmdb/tbsim2.dbf' size 10M;
Tablespace created.
SQL> alter tablespace TBSIM2 default INMEMORY;
Tablespace altered.
When the IM column store is enabled for a tablespace, the INMEMORY attribute applies by default only to objects created in that tablespace afterwards; in other words, it does not affect the existing objects in the tablespace. Objects populated into the In-Memory area are stored in two pools: a 1MB pool that stores the actual column-formatted data and a 64KB pool that stores metadata about the populated objects. We can check the amount of available memory in each pool in the V$INMEMORY_AREA view (Code Listing 5).

Code Listing 5: Amount of available memory in the In-Memory area
SQL> select * from v$inmemory_area;
POOL                       ALLOC_BYTES USED_BYTES POPULATE_STATUS
-------------------------- ----------- ---------- ---------------
1MB POOL                     250609664    6291456 DONE
64KB POOL                     46137344     262144 DONE

INMEMORY Parameters
We can specify how table data is stored in the IM column store using the INMEMORY parameters. The parameters fall into four groups: memory compression, priority, distribute and duplicate.

The memory compression (inmemory_memcompress) clause specifies the compression method for the data stored in the IM column store. The default is NOMEMCOMPRESS, which means in-memory data is not compressed. To instruct the database to compress in-memory data, specify MEMCOMPRESS FOR followed by one of the following methods (Table 1):

Table 1: Compression methods
DML           - This method is optimized for DML operations and performs little or no data compression.
QUERY LOW     - This method compresses in-memory data the least (except for DML) and results in the best query performance. This is the default.
QUERY HIGH    - This method compresses in-memory data more than QUERY LOW, but less than CAPACITY LOW.
CAPACITY LOW  - This method compresses in-memory data more than QUERY HIGH, but less than CAPACITY HIGH, and results in excellent query performance.
CAPACITY HIGH - This method compresses in-memory data the most and results in good query performance.
Recommendation: You can use the Oracle Compression Advisor – the DBMS_COMPRESSION built-in PL/SQL package – for an estimate of the compression ratio that can be achieved with MEMCOMPRESS.
Code Listing 6: Using the MEMCOMPRESS clause
SQL> alter table TBIM1 inmemory memcompress for capacity high;
Table altered.
SQL> alter table MVIM_TB1 inmemory memcompress for capacity low;
Table altered.
SQL> alter table TBIM2 inmemory memcompress for query low;
Table altered.
SQL> create table TBIM3 (n number, v varchar2(1))
  2  inmemory memcompress for query high;
Table created.
SQL> select table_name, inmemory, inmemory_compression from user_tables;
TABLE_NAME    INMEMORY     INMEMORY_COMPRESS
------------- ------------ --------------------
TBIM1         ENABLED      FOR CAPACITY HIGH
TB1           DISABLED
MVIM_TB1      ENABLED      FOR CAPACITY LOW
TBIM2         ENABLED      FOR QUERY LOW
TBIM3         ENABLED      FOR QUERY HIGH
5 rows selected.

The priority (inmemory_priority) clause specifies the data population priority for data in the IM column store. You can specify the priority of data in the IM column store with the following PRIORITY values (Table 2):

Table 2: In-Memory priorities
NONE     - Oracle Database controls when table data is populated in the IM column store. Population of data might be delayed if the memory required for the data is needed for other tasks. Also, Oracle Database might populate some table columns, but not others, in the IM column store. This is the default.
LOW      - Table data is populated in the IM column store before data for database objects with priority NONE and after data for database objects with priority MEDIUM, HIGH or CRITICAL.
MEDIUM   - Table data is populated in the IM column store before data for database objects with priority NONE or LOW and after data for database objects with priority HIGH or CRITICAL.
HIGH     - Table data is populated in the IM column store before data for database objects with priority NONE, LOW or MEDIUM and after data for database objects with priority CRITICAL.
CRITICAL - Table data is populated in the IM column store before data for database objects with priority NONE, LOW, MEDIUM or HIGH.
Some sample examples of using the PRIORITY clause are shown below (Code Listing 7).

Code Listing 7: Using the PRIORITY clause
SQL> alter table TBIM1 inmemory priority low;
Table altered.
SQL> alter table TBIM2 inmemory priority high;
Table altered.
SQL> create table TBIM4 (n number, v varchar2(1), d date)
  2  inmemory memcompress for dml priority low;
Table created.
SQL> select table_name, inmemory, inmemory_compression, inmemory_priority
  2  from user_tables;
TABLE_NAME  INMEMORY      INMEMORY_COMPRESS INMEMORY_PRIORITY
----------- ------------- ----------------- -----------------
TBIM1       ENABLED       FOR QUERY LOW     LOW
TB1         DISABLED
MVIM_TB1    ENABLED       FOR CAPACITY LOW  NONE
TBIM2       ENABLED       FOR QUERY LOW     HIGH
TBIM3       ENABLED       FOR QUERY HIGH    NONE
TBIM4       ENABLED       FOR DML           LOW
6 rows selected.
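Once priorities have been set, it can be useful to check what has actually made it into the IM column store. A minimal sketch, using the standard 12.1.0.2 view V$IM_SEGMENTS (the column selection here is illustrative; it assumes the demo tables created above):

SQL> select owner, segment_name, populate_status,
  2         inmemory_priority, inmemory_compression,
  3         bytes, inmemory_size, bytes_not_populated
  4  from   v$im_segments
  5  order  by owner, segment_name;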
The DISTRIBUTE (inmemory_distribute) and DUPLICATE (inmemory_duplicate) clauses are used only with Oracle Real Application Clusters (RAC). You can read about these clauses in the next part of this article series.
Restrictions on IM
The IM column store has some restrictions:
• Index Organized Tables (IOTs) and clustered tables cannot be populated into the IM column store.
• LONG columns (deprecated since Oracle Database 8i) and out-of-line LOB data types are also not supported in the IM column store.
• Objects that are smaller than 64KB are not populated into memory.
Also, you can use the IM column store on a Logical Standby database, but it cannot be used on an Active Data Guard standby instance in the current release.
Next Part
In the next part of this article series you will read about In-Memory scans and joins, In-Memory with RAC, and the Oracle SQL optimizer with In-Memory.
Kim Berg Hansen www.thansen.dk Blog: http://dspsd.blogspot.com
Time Series Forecasting in SQL
One day our resident data analyst showed me an Excel spreadsheet and said: “Look, Kim, I’ve taken monthly sales data of one of our items for three years and created a Time Series Analysis model on it to forecast the sales of the fourth year. Can you do it in SQL to apply the model on all hundred thousand items?” Well, I listened to her explanations about centered moving average, seasonality, deseasonalizing, regression trends and reseasonalizing. And I read this link http://people.duke.edu/~rnau/411outbd.htm that explains almost the same model as she used. And yes, I could transform her Excel to SQL. Here’s how…
A couple of representative products
I'll make a table containing monthly sales for each item (product):

create table sales (
  item varchar2(10)
, mth  date
, qty  number
);

For demonstration purposes I insert data for two items with seasonal variations:

insert into sales values ('Snowchain', date '2011-01-01', 79);
insert into sales values ('Snowchain', date '2011-02-01', 133);
insert into sales values ('Snowchain', date '2011-03-01', 24);
...
insert into sales values ('Snowchain', date '2013-10-01', 1);
insert into sales values ('Snowchain', date '2013-11-01', 73);
insert into sales values ('Snowchain', date '2013-12-01', 160);
insert into sales values ('Sunshade' , date '2011-01-01', 4);
insert into sales values ('Sunshade' , date '2011-02-01', 6);
insert into sales values ('Sunshade' , date '2011-03-01', 32);
...
insert into sales values ('Sunshade' , date '2013-10-01', 11);
insert into sales values ('Sunshade' , date '2013-11-01', 3);
insert into sales values ('Sunshade' , date '2013-12-01', 5);

The snowchain sells well every winter but hardly anything in the summertime, and the trend is that it sells more and more every year. The sunshade on the other hand sells well in summer, with a downward trend selling less and less every year.
Figure 1: 2011-2013 sales for the two items
Time series
I start by creating a 48 month time series for each item:

select sales.item
     , mths.ts
     , mths.mth
     , extract(year from mths.mth) yr
     , extract(month from mths.mth) mthno
     , sales.qty
from (
       select add_months(date '2011-01-01', level-1) mth
            , level ts -- time series
       from dual
       connect by level <= 48
     ) mths
left outer join sales
     partition by (sales.item)
     on sales.mth = mths.mth
order by sales.item, mths.mth;
The inline view mths creates 48 rows with the months from 2011-01-01 to 2014-12-01 – the 3 years I have sales data for plus the 4th year I want to forecast. By doing a partition outer join on the sales table, I get 48 rows numbered with ts=1..48 for each item having QTY NULL for the last 12 months:

ITEM        TS  MTH      YR    MTHNO  QTY
----------  --  -------  ----  -----  ---
Snowchain    1  2011-01  2011      1   79
Snowchain    2  2011-02  2011      2  133
Snowchain    3  2011-03  2011      3   24
...
Snowchain   34  2013-10  2013     10    1
Snowchain   35  2013-11  2013     11   73
Snowchain   36  2013-12  2013     12  160
Snowchain   37  2014-01  2014      1
Snowchain   38  2014-02  2014      2
Snowchain   39  2014-03  2014      3
...
Snowchain   47  2014-11  2014     11
Snowchain   48  2014-12  2014     12
Sunshade     1  2011-01  2011      1    4
Sunshade     2  2011-02  2011      2    6
Sunshade     3  2011-03  2011      3   32
...
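As a quick sanity check of the densification (my own addition, not from the article), the partitioned outer join should produce exactly 48 rows per item, of which only the 36 months of 2011-2013 carry a quantity:

select sales.item
     , count(*)         total_rows    -- expected: 48 per item
     , count(sales.qty) rows_with_qty -- expected: 36 per item (every month of 2011-2013 was inserted, some with qty 0)
from (
       select add_months(date '2011-01-01', level-1) mth
       from dual
       connect by level <= 48
     ) mths
left outer join sales
     partition by (sales.item)
     on sales.mth = mths.mth
group by sales.item;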
Centered moving average
I place the entire previous SQL in a with clause I call s1 and query that. (This I will continue to do adding a new with clause for each step):

with s1 as (...)
select s1.*
     , case when ts between 7 and 30 then
         ( nvl(avg(qty) over (
             partition by item
             order by ts
             rows between 5 preceding and 6 following
           ), 0)
         + nvl(avg(qty) over (
             partition by item
             order by ts
             rows between 6 preceding and 5 following
           ), 0)
         ) / 2
       else null
       end cma -- centered moving average
from s1
order by item, ts;
For the moving average over 12 months I can use either from -5 to +6 months or from -6 to +5 months (as seen in the two rows between clauses). Centered moving average then means to take the average of those two averages. I only do that for those 24 months (ts=7..30 = 2011-07..2013-06) where 12 months of data is available:

ITEM        TS  MTH      YR    MTHNO  QTY  CMA
----------  --  -------  ----  -----  ---  ------
Snowchain    1  2011-01  2011      1   79
Snowchain    2  2011-02  2011      2  133
...
Snowchain    6  2011-06  2011      6    0
Snowchain    7  2011-07  2011      7    0  30.458
Snowchain    8  2011-08  2011      8    0  36.500
Snowchain    9  2011-09  2011      9    1  39.917
Snowchain   10  2011-10  2011     10    4  40.208
...
Snowchain   29  2013-05  2013      5    0  56.250
Snowchain   30  2013-06  2013      6    0  58.083
Snowchain   31  2013-07  2013      7    0
Snowchain   32  2013-08  2013      8    1
...
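To convince yourself that the two staggered 12-month windows really center the average, you can run them on a toy series where the answer is obvious (this sketch is mine, not from the article). With qty equal to ts, the -5/+6 window averages to ts + 0.5, the -6/+5 window to ts - 0.5, and the centered moving average comes out as exactly ts:

with toy as (
  select level ts, level qty
  from dual
  connect by level <= 48
), windowed as (
  select ts
       , ( avg(qty) over (order by ts rows between 5 preceding and 6 following)
         + avg(qty) over (order by ts rows between 6 preceding and 5 following)
         ) / 2 cma
  from toy
)
select ts, cma              -- cma = ts on every row
from windowed
where ts between 7 and 30   -- only where a full 12-month window exists
order by ts;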
Seasonality

with s1 as (...), s2 as (...)
select s2.*
     , nvl(avg(
         case qty when 0 then 0.0001 else qty end
         / nullif(cma, 0)
       ) over (
         partition by item, mthno
       ), 0) s -- seasonality
from s2
order by item, ts;
The qty divided by the cma gives the factor for how much the particular month sells compared to the average month. The model will fail for months with qty=0, as that would make some of the next steps divide by zero or get wrong results from multiplying with zero. Therefore I change any zero sale to a small number.

Seasonality is then the average of this factor for each month, calculated by an analytic avg partitioned by item and month number. What this means is that the case expression finds the factor for, say, the months of July 2011 and July 2012 for a given item, and the partitioning by month number then finds the average of those two; that average (column alias s) will be in all 4 July rows for that item. In other words, seasonality is calculated for the 12 months and repeated in all 4 years:

ITEM        TS  MTH      YR    MTHNO  QTY  CMA     S
----------  --  -------  ----  -----  ---  ------  ------
...
Snowchain    9  2011-09  2011      9    1  39.917   .0125
Snowchain   10  2011-10  2011     10    4  40.208   .0774
Snowchain   11  2011-11  2011     11   15  40.250   .3435
Snowchain   12  2011-12  2011     12   74  40.250  2.5094
Snowchain   13  2012-01  2012      1  148  40.250  3.3824
Snowchain   14  2012-02  2012      2  209  40.292  4.8771
Snowchain   15  2012-03  2012      3   30  40.292   .7606
Snowchain   16  2012-04  2012      4    2  40.208   .0249
Snowchain   17  2012-05  2012      5    0  40.250   .0000
Snowchain   18  2012-06  2012      6    0  44.417   .0000
Snowchain   19  2012-07  2012      7    0  49.292   .0000
Snowchain   20  2012-08  2012      8    1  51.667   .0097
Snowchain   21  2012-09  2012      9    0  53.750   .0125
Snowchain   22  2012-10  2012     10    3  54.167   .0774
Snowchain   23  2012-11  2012     11   17  54.083   .3435
Snowchain   24  2012-12  2012     12  172  54.083  2.5094
Snowchain   25  2013-01  2013      1  167  54.083  3.3824
Snowchain   26  2013-02  2013      2  247  54.083  4.8771
Snowchain   27  2013-03  2013      3   42  54.083   .7606
Snowchain   28  2013-04  2013      4    0  54.000   .0249
...
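As a worked check of these numbers (my own verification, not part of the article): the January factor is the average of qty/cma for the two Januaries that actually have a centered moving average, January 2012 and January 2013:

select round( (148/40.250 + 167/54.083) / 2, 4 ) s_january
from dual;
-- roughly 3.3824, matching the S value shown on every January row of Snowchain
-- (any tiny difference comes from the rounding of the displayed CMA values)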
Deseasonalized quantity

with s1 as (...), s2 as (...), s3 as (...)
select s3.*
     , case when ts <= 36 then
         nvl(
           case qty when 0 then 0.0001 else qty end
           / nullif(s, 0)
         , 0)
       end des -- deseasonalized
from s3
order by item, ts;
Dividing the quantity by the seasonality factor gives us a deseasonalized quantity that better shows the trend of the item sales, with less disturbance from seasonal variations in sales. Again I need to trick zero sales into 0.0001.

ITEM        TS  MTH      QTY  CMA     S       DES
----------  --  -------  ---  ------  ------  -------
Snowchain    1  2011-01   79          3.3824   23.356
Snowchain    2  2011-02  133          4.8771   27.270
Snowchain    3  2011-03   24           .7606   31.555
...
Snowchain   22  2012-10    3  54.167   .0774   38.743
Snowchain   23  2012-11   17  54.083   .3435   49.490
Snowchain   24  2012-12  172  54.083  2.5094   68.542
Snowchain   25  2013-01  167  54.083  3.3824   49.373
Snowchain   26  2013-02  247  54.083  4.8771   50.645
...
Trend (regression)
Having the deseasonalized quantity I use regression to calculate a trend:

with s1 as (...), s2 as (...), s3 as (...), s4 as (...)
select s4.*
     , regr_intercept(des, ts) over (partition by item)
       + ts * regr_slope(des, ts) over (partition by item) t -- trend
from s4
order by item, ts;
The two analytic functions calculate a linear regression of a graph line with des on the Y axis and ts on the X axis. regr_intercept gives the point where the regression line intersects the Y axis and regr_slope gives the slope of the line. By multiplying the slope with ts (= months) and adding it to the intercept, I get the points of the trend line:

ITEM        TS  MTH      QTY  CMA  S       DES      T
----------  --  -------  ---  ---  ------  -------  ------
Snowchain    1  2011-01   79       3.3824   23.356  32.163
Snowchain    2  2011-02  133       4.8771   27.270  33.096
Snowchain    3  2011-03   24        .7606   31.555  34.030
...
Snowchain   34  2013-10    1        .0774   12.914  62.976
Snowchain   35  2013-11   73        .3435  212.518  63.910
Snowchain   36  2013-12  160       2.5094   63.760  64.844
Snowchain   37  2014-01            3.3824           65.777
Snowchain   38  2014-02            4.8771           66.711
Snowchain   39  2014-03             .7606           67.645
...
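If you have not used these regression functions before, here is a tiny self-contained illustration of mine (not from the article) on a perfectly linear toy series y = 10 + 2*x, where the intercept and slope are known in advance; the functions are used in their aggregate form here rather than the analytic form of the article:

with toy as (
  select level x, 10 + 2*level y
  from dual
  connect by level <= 5
)
select regr_intercept(y, x) intercept  -- 10
     , regr_slope(y, x)     slope      -- 2
from toy;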
Viewed on a graph, I can get a feeling for how perfect the seasonal variations for the item are. The closer the deseasonalized graph is to a straight line, the closer the item is to having the exact same seasonal pattern every year. Small numbers near zero can give some hiccups on the deseasonalized graph (like late summer every year for the snowchain), but they tend to switch to either side and even out, so the regression trend line still shows the trend pretty well. If I had not used the 0.0001 trick in the code to remove zero sales, there would have been more points on the red graph going to zero, the trend line slope would have been lower and my forecast too conservative.
Figure 2: Deseasonalization and trend of Snowchain
Reseasonalize (forecast)
Having calculated the trend line, I can re-apply the seasonality factor:

with s1 as (...), s2 as (...), s3 as (...), s4 as (...), s5 as (...)
select s5.*
     , t * s forecast -- reseasonalized
from s5
order by item, ts;

ITEM        TS  MTH      QTY  S       DES      T       FORECAST
----------  --  -------  ---  ------  -------  ------  --------
Snowchain    1  2011-01   79  3.3824   23.356  32.163   108.788
Snowchain    2  2011-02  133  4.8771   27.270  33.096   161.414
...
Snowchain   34  2013-10    1   .0774   12.914  62.976     4.876
Snowchain   35  2013-11   73   .3435  212.518  63.910    21.953
Snowchain   36  2013-12  160  2.5094   63.760  64.844   162.718
Snowchain   37  2014-01       3.3824           65.777   222.487
Snowchain   38  2014-02       4.8771           66.711   325.357
...
Snowchain   43  2014-07        .0000           71.380      .000
Snowchain   44  2014-08        .0097           72.314      .700
Snowchain   45  2014-09        .0125           73.247      .918
Snowchain   46  2014-10        .0774           74.181     5.744
Snowchain   47  2014-11        .3435           75.115    25.802
Snowchain   48  2014-12       2.5094           76.049   190.836
I got the deseasonalized qty by dividing qty by the seasonality factor. Then I straightened out the deseasonalized qty into a trend line. And in this final step I multiply by the seasonality factor to get back a line shaped like the original sales quantity:
Figure 3: Reseasonalization (forecast) of Snowchain

The green reseasonalized graph matches the blue actual sales pretty well for 2011-2013, and since it is calculated from the trend line, we can continue the green graph into 2014 (or further if we like) – it will keep having the correct shape, just a bit “larger” or “smaller” as it follows the increasing or decreasing trend line.
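A quick numeric check of the reseasonalizing step, taken from the January 2014 row listed above (my own arithmetic, not part of the article): multiplying the trend value by the seasonality factor reproduces the FORECAST column.

select round(65.777 * 3.3824, 3) forecast_2014_01
from dual;
-- about 222.48, matching the 222.487 shown above; the small difference is
-- rounding of the displayed T and S values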
Model describes reality?
I get a quick and dirty overview of the model fit with an analytic sum per year:

with s1 as (...), s2 as (...), s3 as (...), s4 as (...), s5 as (...)
select item
     , mth
     , qty
     , t * s forecast -- reseasonalized
     , sum(qty) over (partition by item, yr) qty_yr
     , sum(t * s) over (partition by item, yr) fc_yr
from s5
order by item, ts;
For sunshade we can get a similar resulting graph:
Figure 4: Reseasonalization (forecast) of Sunshade
And I can see that the model might not be perfect, but it is definitely close:

ITEM        MTH      QTY  FORECAST  QTY_YR   FC_YR
----------  -------  ---  --------  ------  ------
Snowchain   2011-01   79   108.788     331  421.70
...
Snowchain   2012-01  148   146.687     582  556.14
...
Snowchain   2013-01  167   184.587     691  690.57
Snowchain   2013-02  247   270.710     691  690.57
...
Snowchain   2013-11   73    21.953     691  690.57
Snowchain   2013-12  160   162.718     691  690.57
Snowchain   2014-01        222.487          825.00
Snowchain   2014-02        325.357          825.00
...
Sunshade    2011-01    4     2.885     377  390.59
...
Sunshade    2012-01    2     2.418     321  322.75
...
Sunshade    2013-01    2     1.951     263  254.91
...
Sunshade    2013-05   23    23.645     263  254.91
Sunshade    2013-06   46    54.579     263  254.91
Sunshade    2013-07   73    51.299     263  254.91
...
Sunshade    2013-12    5     3.764     263  254.91
Sunshade    2014-01          1.484          187.07
...
Differences happen where the seasonal variations are not perfect, where the trend is not really linear, or where the numbers are too low. But it is “good enough”.
Sales and forecast together
Having satisfied myself that the model is reasonably good, I turn to my end users, who do not want to see a lot of superfluous information – they just need the sales for 2011 to 2013 followed by the 2014 forecast:

with s1 as (...), s2 as (...), s3 as (...), s4 as (...), s5 as (...)
select item, mth
     , case when ts <= 36 then qty
            else round(t * s)
       end qty
     , case when ts <= 36 then 'Actual'
            else 'Forecast'
       end type
     , sum(
         case when ts <= 36 then qty
              else round(t * s)
         end
       ) over (
         partition by item, extract(year from mth)
       ) qty_yr
from s5
order by item, ts;
Which gives a nice and simple output:

ITEM        MTH      QTY  TYPE      QTY_YR
----------  -------  ---  --------  ------
Snowchain   2011-01   79  Actual       331
...
Snowchain   2012-12  172  Actual       582
Snowchain   2013-01  167  Actual       691
...
Snowchain   2013-12  160  Actual       691
Snowchain   2014-01  222  Forecast     825
Snowchain   2014-02  325  Forecast     825
...
Snowchain   2014-11   26  Forecast     825
Snowchain   2014-12  191  Forecast     825
Sunshade    2011-01    4  Actual       377
...
Sunshade    2012-12    3  Actual       321
Sunshade    2013-01    2  Actual       263
...
Sunshade    2013-12    5  Actual       263
Sunshade    2014-01    1  Forecast     187
Sunshade    2014-02    7  Forecast     187
...
Sunshade    2014-11    3  Forecast     187
Sunshade    2014-12    3  Forecast     187
That can be represented in an easily readable graph:
Figure 5: 2011-2013 sales and 2014 forecast for the two items

The massive apply
But actually what my data analyst asked was if I could apply this model to a hundred thousand items. Yes, the SQL above works nicely for that many items. But she also wanted the results put into a table so that the forecast data could be used and reused for graphs, applications and all sorts of things for the purchasing department. No problem, said I, I'll create the table:

create table forecast (
  item varchar2(10)
, mth  date
, qty  number
);

And then with one insert statement, I created the 2014 forecast of a hundred thousand items in 1½ minutes:

insert into forecast
with s1 as (...), s2 as (...), s3 as (...), s4 as (...), s5 as (...)
select item
     , mth
     , t * s qty -- forecast
from s5
where s5.ts >= 37; -- just 2014 data

She was happy that day.
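One sanity check I might add after such a mass insert (my own addition, not part of the article) is to confirm that every item received exactly 12 forecast rows:

select count(distinct item)            items
     , count(*)                        forecast_rows
     , count(*) / count(distinct item) rows_per_item -- expected: 12
from forecast;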
Conclusion
The original Time Series Analysis model was built in Excel with a series of successive columns of formulas. Using successive with clauses and analytic functions to calculate centered moving average and regression, I could step by step transform the same model into a SQL statement for very efficient forecasting of a hundred thousand time series in one go.

But this is just a single statistical model. There are many, many other variants of models. Surely I can write them in SQL for efficiency – but handwriting hundreds of models would be very tedious. Tools like R have such models built in, so in the next issue of OTech Magazine I'll write about how to use R packages and the free ROracle driver to do forecasting with many statistical models.

If you see the presentation I gave at ODTUG Kscope14 in Seattle, http://www.slideshare.net/KimBergHansen/analytic-functions-advancedcases, you can find a comparison of this method with another simple and crude model I built with regr_slope. The scripts can be found here: http://goo.gl/q1YJRL
call for content

OTech Magazine's goal is to offer an insight into Oracle technologies and the way they are put into action. OTech Magazine publishes news stories, credible rumors and how-to's covering a variety of topics. As a trustworthy technology magazine, OTech Magazine provides opinion and analysis on the news in addition to the facts.

Our readership is made up of professionals who work with Oracle and Oracle related technologies on a daily basis. In addition we cover topics relevant to professions such as software architects, developers, designers and others.

OTech Magazine's authors are considered to be the best Oracle professionals in the world. Only selected, high-quality articles will make the magazine. Our editors are known worldwide for their knowledge in the Oracle field.

Do you want to share your – Oracle – story with the world? Please fill in our Call-for-Content or contact Editor-in-Chief Douwe Pieter van den Bos.
Douwe Pieter van den Bos +31 6 149 143 43 douwepieter@otechmag.com
OTech Magazine
OTech Magazine is an independent magazine for Oracle professionals. OTech Magazine's goal is to offer a clear perspective on Oracle technologies and the way they are put into action. OTech Magazine publishes news stories, credible rumors and how-to's covering a variety of topics. As a trusted technology magazine, OTech Magazine provides opinion and analysis on the news in addition to the facts.

OTech Magazine is a trusted source for news, information and analysis about Oracle and its products. Our readership is made up of professionals who work with Oracle and Oracle related technologies on a daily basis; in addition we cover topics relevant to niches like software architects, developers, designers and others.

OTech Magazine's writers are considered the top Oracle professionals in the world. Only selected and high-quality articles will make the magazine. Our editors are trusted worldwide for their knowledge in the Oracle field.

OTech Magazine will be published four times a year, once every season. In the fast, internet-driven world it's hard to keep track of what's important and what's not. OTech Magazine will help the Oracle professional keep focus.

OTech Magazine will always be available free of charge. Therefore the digital edition of the magazine will be published on the web. OTech Magazine is an initiative of Douwe Pieter van den Bos. Please note our terms and our privacy policy at www.otechmag.com.
Independence
OTech Magazine is an independent magazine. We are not affiliated, associated, authorized, endorsed by, or in any way officially connected with The Oracle Corporation or any of its subsidiaries or its affiliates. The official Oracle web site is available at www.oracle.com. All Oracle software, logos etc. are registered trademarks of the Oracle Corporation. All other company and product names are trademarks or registered trademarks of their respective companies.
In other words: we are not Oracle, Oracle is Oracle. We are OTech Magazine.
Authors
Why would you like to be published in OTech Magazine?
- Credibility. OTech Magazine only publishes stories of the best-of-the-best of the Oracle technology professionals. Therefore, if you publish with us, you are the best-of-the-best.
- Quality. Only selected articles make it to OTech Magazine. Therefore, your article must be of high quality.
- Reach. Our readers are highly interested in the opinion of the best Oracle professionals in the world. And our readers are all around the world. They will appreciate your views.

OTech Magazine is always looking for the best of the best of the Oracle technology professionals to write articles. Because we only want to offer high-quality information, background stories, best practices or how-to's to our readers, we also need the best of the best. Do you want to be part of the select few who write for OTech Magazine? Review our 'writers guidelines' and submit a proposal today at www.otechmag.com.
Advertisement
In this issue of OTech Magazine no advertisements have been placed. For now, this is solely a hobby project. In the future, to make sure the digital edition of OTech Magazine will still be available free of charge, we will add advertisements. Are you willing to participate with us? Contact us at www.otechmag.com or +31614914343.
Intellectual Property
OTech Magazine and otechmag.com are trademarks that you may not use without written permission of OTech Magazine. The contents of otechmag.com and each issue of OTech Magazine, including all text and photography, are the intellectual property of OTech Magazine. You may retrieve and display content from this website on a computer screen, print individual pages on paper (but not photocopy them), and store such pages in electronic form on disk (but not on any server or other storage device connected to a network) for your own personal, non-commercial use. You may not make commercial or other unauthorized use, by publication, distribution, or performance without the permission of OTech Magazine. To do so without permission is a violation of copyright law.
All content is the sole responsibility of the authors. This includes all text and images. Although OTech Magazine does its best to prevent copyright violations, we cannot be held responsible for infringement of any rights whatsoever. The opinions stated by authors are their own and cannot be related in any way to OTech Magazine.
Programs and Code Samples
OTech Magazine and otechmag.com could contain technical inaccuracies or typographical errors. Also, illustrations contained herein may show prototype equipment. Your system configuration may differ slightly. The website and magazine contains small programs and code samples that are furnished as simple examples to provide an illustration. These examples have not been thoroughly tested under all conditions. otechmag.com, therefore, cannot guarantee or imply reliability, serviceability or function of these programs and code samples. All programs and code samples contained herein are provided to you AS IS. IMPLIED WARRANTIES OF MERCHANTABILITY, NONINFRINGEMENT AND FITNESS FOR A PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED.
OTECH MAGAZINE
Have a great Christmas. See you in 2015.