Approaching Real-Time Business Intelligence: Trading at the Speed of Light
Sean McClure, Ph.D.
Business Analytics, Excellerate Inc.
smcclure@excellerate4success.com
www.excellerate4success.com
Overview
• Introducing Excellerate
• Real-Time BI and HFT
• Information at High Frequency
• Strategies at High Frequency
• Developing and Deploying Models
• Executing and Monitoring Real-Time Systems
• Summary
Themes
• Metrics
• Data Mining
• Big Data
• Meta Data
• Prediction
About Us!
Business Intelligence Service Providers
• Dedicated to bringing top-quality business intelligence expertise to successful growing organizations (SGOs);
• Aggressively researching industry best practices and best-of-breed software tools to deliver high-end analytics and data mining expertise;
• Business intelligence model supported by Subject Matter Experts (SMEs) in key business areas.
Real-Time Business Intelligence
Defining “Real-Time”
Three types of latency [1]:
• Data latency: the time taken to collect and store the data;
• Analysis latency: the time taken to analyze the data and turn it into actionable information; and
• Action latency: the time taken to react to the information and take action.
Approaching “zero” latency
• Real-time business intelligence technologies are designed to reduce all three latencies to as close to zero as possible;
• Traditional BI only seeks to reduce data latency.
Real-Time Business Intelligence
Real-Time BI in various industries
• Debit and credit fraud detection
• Marketing
• Inventory control
• Supply-chain optimization
• Customer relationship management (CRM)
• Dynamic pricing and yield management
• Data validation
• Operational intelligence and risk management
• Call center optimization
• Transportation industry
• Finance (biggest candidate)
High Frequency Trading (HFT)
Best Case Study for “Real-Time” Intelligence
• Trading platform that uses powerful computers to transact a large number of orders at very fast speeds;
• Uses complex algorithms to analyze multiple markets and execute orders based on market conditions;
• Traders with the fastest execution speeds will be more profitable than traders with slower execution speeds (arbitrage opportunities).
In the U.S., high-frequency trading accounts for ~73% of all equity trading volume [5].
Figure 1: Execution latency (high to low) versus position holding period (short to long) for traditional long-term investing, algorithmic/electronic trading, and high-frequency trading.
High Frequency Information
Properties of Tick Data: Quote, Trade, Price and Volume Information
• Timestamp: date/time the quote originated (>20ms)
• Security ID
• Bid price: highest price available for sale of the security
• Ask price: lowest price entered for buying the security
• Available bid volume: total demand
• Available ask volume: total supply
• Last trade price: price at which the last trade in the security cleared
• Last trade size: actual size of the last executed trade
• Option-specific data
Bid and ask quotes are provided by other market participants through limit orders.
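The tick fields listed above map naturally onto a small record type. The following is a minimal sketch only: the class name `Tick`, its field names, and the sample values are illustrative assumptions, not a format taken from any exchange feed, and option-specific data is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Tick:
    """One quote/trade update. Field names are hypothetical."""
    timestamp: datetime          # date/time the quote originated
    security_id: str
    bid_price: float             # highest price available for sale of the security
    ask_price: float             # lowest price entered for buying the security
    bid_volume: int              # available bid volume (total demand)
    ask_volume: int              # available ask volume (total supply)
    last_trade_price: Optional[float] = None
    last_trade_size: Optional[int] = None

    @property
    def spread(self) -> float:
        """Bid-ask spread implied by the quote."""
        return self.ask_price - self.bid_price

    @property
    def mid_price(self) -> float:
        """Midpoint between bid and ask."""
        return (self.bid_price + self.ask_price) / 2.0


# Made-up sample values for a hypothetical security "XYZ".
tick = Tick(datetime(2011, 1, 3, 9, 30, 0, 250000), "XYZ",
            10.00, 10.02, 500, 300, 10.01, 100)
```

Keeping the record immutable (`frozen=True`) is one possible design choice for archived ticks, since stored history should not change after it is received.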
High Frequency Information
Recent microstructure research and advances in econometric modeling tell us there are unique characteristics to tick data:
• irregularly spaced quotes
• arriving randomly
• at very short time intervals
(low-frequency data is the opposite)
• Irregularities provide a wealth of information not available in low-frequency data.
• Inter-trade durations may signal changes in market volatility, liquidity, and other variables.
• The volume of data allows for statistically precise inferences.
• Number of observations in a single day of tick data = 30 years of daily observations
High Frequency Information
Modeling the arrivals of tick data creates a host of opportunities not available at low frequency
• The time distance between quote arrivals carries information
• Arrival processes: quote processes, trade processes, price processes, volume processes
Duration models: estimate the factors affecting the duration between ticks
• High trade duration: higher likelihood of unobserved bad news
• Low trade duration: higher likelihood of unobserved good news
• Absence of trade: lack of news, low levels of liquidity, trading halts, trader motivations
• Low price duration: increased volatility
• Low volume duration: increased liquidity
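Trade durations are simply the gaps between consecutive trade timestamps. As a rough sketch (the function name `inter_trade_durations` and the sample timestamps are assumptions for illustration), they could be computed like this before being fed to a duration model:

```python
from datetime import datetime
from statistics import mean


def inter_trade_durations(timestamps):
    """Return the gaps, in seconds, between consecutive trade times."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


# Made-up trade arrival times, irregularly spaced as tick data usually is.
trade_times = [
    datetime(2011, 1, 3, 9, 30, 0, 0),
    datetime(2011, 1, 3, 9, 30, 0, 200000),
    datetime(2011, 1, 3, 9, 30, 1, 700000),
    datetime(2011, 1, 3, 9, 30, 2, 0),
]
durations = inter_trade_durations(trade_times)
average_duration = mean(durations)
```

This only measures the durations; estimating the factors that drive them is the job of the duration model itself, which is not shown here.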
High Frequency Information
Data sampling methods overcome irregularities in high-frequency data for ease of processing
• Most modern computational techniques have been developed to work with regularly spaced data (easy to process)
• High-frequency data-sampling methods were developed to overcome irregularities in tick data by sampling at predetermined periods of time
Traditional approach (Figure A: quotes sampled at Minute 1, Minute 2, Minute 3), using the last quote before the sampling time:
$$\hat{q}_t = q_{t,\mathrm{last}}$$
Linear time-weighted interpolation (Figure B: same sampling times), weighting the last and next quotes by their time distance from the sampling time:
$$\hat{q}_t = q_{t,\mathrm{last}} + \left(q_{t,\mathrm{next}} - q_{t,\mathrm{last}}\right)\frac{t - t_{\mathrm{last}}}{t_{\mathrm{next}} - t_{\mathrm{last}}}$$
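The two sampling rules can be compared side by side in a few lines. This is a sketch under stated assumptions: quote times are given in seconds and strictly increasing, the function names `sample_last` and `sample_interpolated` are hypothetical, and the quote values are made up.

```python
from bisect import bisect_right


def sample_last(times, quotes, t):
    """Traditional approach: the last quote at or before time t."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no quote at or before t")
    return quotes[i]


def sample_interpolated(times, quotes, t):
    """Linear time-weighted interpolation between the last and next quotes."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no quote at or before t")
    if i == len(times) - 1:
        return quotes[i]  # assumption: with no next quote, fall back to the last one
    t_last, t_next = times[i], times[i + 1]
    q_last, q_next = quotes[i], quotes[i + 1]
    return q_last + (q_next - q_last) * (t - t_last) / (t_next - t_last)


# Irregular quote arrivals (seconds since the open) sampled on a one-minute grid.
tick_times = [0.0, 40.0, 70.0, 130.0]
tick_quotes = [10.00, 10.04, 10.01, 10.07]
grid = [60.0, 120.0]
last_samples = [sample_last(tick_times, tick_quotes, t) for t in grid]
interp_samples = [sample_interpolated(tick_times, tick_quotes, t) for t in grid]
```

Note that the interpolated value uses the next quote, which is not yet known at the sampling time, so it suits historical analysis rather than live decisions.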
High Frequency Information
Security Price Adjustments to Information
• In an inefficient market, the price of the security begins adjusting before the news becomes public (“information leakage”) and keeps moving past its new level after the news arrives (“overshooting”);
• Many solid trading strategies exploit both the information leakage and the overshooting to generate consistent profits.
Figure: Incorporation of information in efficient and inefficient markets. Two panels (Good News, Bad News), each showing the efficient and the inefficient market response around the information arrival time.
High-Frequency Strategies
Trading on High-Frequency Information
• Traders leverage state-of-the-art information technology to implement trading strategies that exploit high-frequency opportunities;
• High-frequency trading strategies typically fall into four main categories.
HFT-based Strategies
• Electronic Liquidity Provision: Spread Capturing; Rebates
• Statistical Arbitrage: Market Neutral Arbitrage; Cross Asset, Cross Market & ETF
• Liquidity Detection: Sniffing/Pinging/Sniping; Quote Matching
• Others: Latency Arbitrage; Short Term Momentum
High-Frequency Strategies
Liquidity Provision Strategies: Spread Capturing
• Liquidity providers profit from the spread between bid and ask prices by continuously buying and selling securities;
• Executed predominantly using limit orders;
• High-speed transmission of orders and low-latency execution are required for successful implementation of liquidity provision strategies.
Figure (Market Transactions): market buy and sell orders and limit buy and sell orders around the market price, with the bid, the ask, and the bid-ask spread between them.
High-Frequency Strategies
Predictions based on the Limit Order Book
• Shape of the limit order book is predictive of impending changes in market price;
• Exploited by market-maker traders;
• Depends on the probability distribution of arriving market orders;
• Shape can be estimated when the book is not observable.
Figure: Two limit order books (buy side and sell side), each annotated with the direction of market price movement implied by its shape.
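One very simple way to summarize the shape of a visible book is a volume imbalance between the two sides. This is an illustrative proxy chosen for this sketch, not the estimator used in the presentation; the function name `book_imbalance` and the resting sizes are assumptions.

```python
def book_imbalance(bid_sizes, ask_sizes):
    """Return (bid - ask) / (bid + ask) resting volume, a value in [-1, 1]."""
    bid_total = sum(bid_sizes)
    ask_total = sum(ask_sizes)
    total = bid_total + ask_total
    if total == 0:
        raise ValueError("order book is empty")
    return (bid_total - ask_total) / total


# Made-up resting sizes at the three best price levels on each side.
imbalance = book_imbalance([500, 300, 200], [200, 150, 150])
```

A positive value means more resting buy interest than sell interest. Whether that predicts an upward price move depends on the arrival distribution of market orders, so any threshold would need to be back-tested.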
High-Frequency Strategies
Statistical Arbitrage
• “Stat-Arb” rests squarely on data mining. It finds statistical relationships in large amounts of data and builds a model of those relationships;
• Leverages state-of-the-art technology to profit from small and short-lived discrepancies between securities;
• Arbitrageurs generate profits by selling the asset on the market where it is valued higher and simultaneously buying it on another market where it is valued lower.
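High-Frequency Strategies
Detecting Statistical Anomalies in Price Levels
1) Identify securities that trade in the chosen frequency unit.
2) Measure the difference between prices of the identified securities:
$$\Delta S_{ij,t} = S_{i,t} - S_{j,t}, \quad t \in [1, T]$$
3) Select the most stable relationships:
$$\min_{i,j} \sum_{t=1}^{T} \left(\Delta S_{ij,t}\right)^2$$
4) Estimate distributional properties of the difference:
$$E[\Delta S_t] = \frac{1}{T} \sum_{t=1}^{T} \Delta S_t$$
$$\sigma^2[\Delta S_t] = \frac{1}{T-1} \sum_{t=1}^{T} \left(\Delta S_t - E[\Delta S_t]\right)^2$$
5) Monitor and act upon differences in security prices, when
$$\Delta S_\tau = S_{i,\tau} - S_{j,\tau} > E[\Delta S_\tau] + 2\sigma[\Delta S_\tau]$$
or
$$\Delta S_\tau = S_{i,\tau} - S_{j,\tau} < E[\Delta S_\tau] - 2\sigma[\Delta S_\tau]$$
6) Once the gap in prices reverses, close out the position (or exit at a stop loss).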
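The monitoring step can be sketched for a single pair of securities as follows. The function name `spread_signal`, the returned labels, and the price histories are hypothetical. The trade direction (sell the relatively expensive security, buy the relatively cheap one) is an assumption that the gap reverts toward its historical mean.

```python
from statistics import mean, stdev


def spread_signal(prices_i, prices_j, current_i, current_j, width=2.0):
    """Compare the current price gap with its historical mean +/- width * sigma."""
    diffs = [a - b for a, b in zip(prices_i, prices_j)]
    mu = mean(diffs)
    sigma = stdev(diffs)  # sample standard deviation (divides by T - 1)
    gap = current_i - current_j
    if gap > mu + width * sigma:
        return "sell_i_buy_j"   # assumed action: gap unusually wide
    if gap < mu - width * sigma:
        return "buy_i_sell_j"   # assumed action: gap unusually narrow
    return "no_trade"


# Made-up price histories for two securities that tend to move together.
history_i = [10.0, 10.1, 10.2, 10.1, 10.0]
history_j = [9.0, 9.1, 9.1, 9.1, 9.0]
wide = spread_signal(history_i, history_j, 10.5, 9.2)
narrow = spread_signal(history_i, history_j, 10.0, 9.2)
normal = spread_signal(history_i, history_j, 10.1, 9.1)
```

A production version would also need the pair-selection step, transaction costs, and the close-out and stop-loss rules, none of which are modeled here.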
High-Frequency Strategies
Fundamental Arbitrage Strategies by Asset Class
Asset Class: Fundamental Arbitrage Strategy
• Foreign Exchange: Triangular Arbitrage
• Foreign Exchange: Uncovered Interest Parity (UIP) Arbitrage
• Equities: Different Equity Classes of the Same Issuer
• Equities: Market Neutral Arbitrage
• Equities: Liquidity Arbitrage
• Equities: Large-to-Small Information Spillovers
• Futures and the Underlying Asset: Basis Trading
• Indexes and ETFs: Index Composition Arbitrage
• Options: Volatility Curve Arbitrage
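As one concrete case from the list of strategies, triangular arbitrage checks whether three exchange rates are mutually consistent. The sketch below is deliberately simplified: it uses a single rate per currency pair and ignores bid/ask spreads and transaction costs. The function name `triangular_round_trip` and the rates are made-up assumptions.

```python
def triangular_round_trip(eur_usd, usd_jpy, eur_jpy):
    """EUR obtained from 1 EUR via EUR -> USD -> JPY -> EUR.

    eur_usd: USD per 1 EUR; usd_jpy: JPY per 1 USD; eur_jpy: JPY per 1 EUR.
    """
    usd = 1.0 * eur_usd      # convert EUR to USD
    jpy = usd * usd_jpy      # convert USD to JPY
    return jpy / eur_jpy     # convert JPY back to EUR


# Illustrative rates only; a result above 1.0 suggests a gross opportunity.
round_trip = triangular_round_trip(1.30, 80.0, 103.0)
gross_edge = round_trip - 1.0
```

With real quotes each leg would trade at the bid or the ask, so the gross edge shown here would shrink or disappear once spreads and fees are included.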
Model Development/Deployment
Model Development
Ideas
• Academic research and proprietary extensions
Tools
• Modeling predominantly in MATLAB/R
• C++ for back-tests and transition into production
Back Testing
• Modeled relationships tested on lengthy spans of tick data
• Forecasting validity
• Various market situations
Models used in HFT
Linear Econometric Models
• Autoregressive (AR) Estimation
• Moving Average (MA) Estimation
• Autoregressive Moving Average (ARMA)
• Cointegration
Volatility Modeling
• To model observed volatility clustering: ARMA or GARCH
Nonlinear Econometric Models (allow for modeling of complex nontrivial relationships in data)
• Taylor series expansion
• Threshold autoregressive model
• Markov switching model
• Nonparametric estimation
• Neural Networks
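To make the simplest linear model in the list concrete, here is a minimal sketch of an AR(1) fit, x_t = c + phi * x_{t-1} + e_t, estimated by ordinary least squares in plain Python. The function names `fit_ar1` and `forecast_next` are hypothetical, and the series is synthetic and noise-free so the fit can be checked by eye. Real work would more likely use the MATLAB/R tooling mentioned above.

```python
def fit_ar1(series):
    """OLS estimate of x_t = c + phi * x_{t-1} + e_t; returns (c, phi)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    if var == 0:
        raise ValueError("lagged series has no variation")
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_next(series, c, phi):
    """One-step-ahead point forecast from the fitted AR(1)."""
    return c + phi * series[-1]


# Synthetic series generated by x_t = 1.0 + 0.5 * x_{t-1} with no noise.
series = [0.0]
for _ in range(6):
    series.append(1.0 + 0.5 * series[-1])
c, phi = fit_ar1(series)
next_value = forecast_next(series, c, phi)
```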
Model Development/Deployment
Back-Testing Econometric Models
Model Accuracy Analysis
Point Forecasts
• Predict that the price will reach a certain level/point
• Regression of realized values from historical data against out-of-sample forecasts
Directional Forecasts
• Make decisions to enter into positions based on expectations of the price going up or down (without a target)
Accuracy Curves
• Compare the accuracy of probabilistic forecasts
• HFT models are evaluated with TSA curves
Figure (Accuracy Curve): hit rate (%) versus miss rate (%), each from 0 to 100%, for Model A, Model B, Model C, and a random forecast.
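For directional forecasts, the basic back-test statistic is the hit rate: the share of periods in which the forecast direction matched the realized move. The sketch below assumes forecasts are encoded as +1 (up) or -1 (down) and that a zero realized return counts as a miss. The function name `directional_hit_rate` and the data are made up.

```python
def directional_hit_rate(forecasts, realized_returns):
    """Percentage of periods where forecast and realized return share a sign."""
    if len(forecasts) != len(realized_returns) or not forecasts:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for f, r in zip(forecasts, realized_returns) if f * r > 0)
    return 100.0 * hits / len(forecasts)


# Made-up out-of-sample forecasts (+1 up, -1 down) and realized returns.
forecasts = [+1, -1, +1, +1, -1]
realized = [0.02, -0.01, -0.03, 0.01, 0.04]
hit_rate = directional_hit_rate(forecasts, realized)
miss_rate = 100.0 - hit_rate
```

A single hit rate corresponds to one point on an accuracy curve. Tracing the full curve would require repeating the calculation across decision thresholds, which is not shown here.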
Executing Real-Time Systems
Execution Optimization Algorithms
• Algorithms spanning order-execution processes
• Designed to optimize trading execution once the buy-and-sell decisions have been made elsewhere:
  • best way to route the order to the exchange
  • best point in time to execute a submitted order (non-market order)
  • best sequence of sizes in which the order should be optimally processed
Common Types
1) Market Aggressiveness Selection: algorithms designed to choose between market and limit orders for optimal execution;
2) Price-Scaling: algorithms designed to select the best execution price according to the pre-specified trading benchmarks; and
3) Size Optimization: algorithms that determine the optimal ways to break down large trading lots into smaller parcels to minimize adverse costs (cost of market impact).
Executing Real-Time Systems
Execution Optimization Algorithms
Market Aggressiveness Selection
• Balances passive and aggressive trading using optimization
$$\min_{\alpha} \; \mathrm{Cost}(\alpha) + \lambda\,\mathrm{Risk}(\alpha)$$
$$\mathrm{Cost}(\alpha) = E_0\left[P(\alpha) - P_b\right], \quad \mathrm{Risk}(\alpha) = \sigma\left(\varepsilon(\alpha)\right), \quad P(\alpha) = P + f(X, \alpha) + g(X) + \varepsilon(\alpha)$$
where
• $\alpha$: trading aggressiveness
• $\lambda$: weight placed on risk relative to cost
• $P(\alpha)$: realized execution price
• $P$: market price at order entry
• $P_b$: benchmark execution price
• $f(X, \alpha)$: temporary impact due to liquidity
• $g(X)$: price impact due to information leakage
• $\varepsilon(\alpha)$: deviation of trading outcome
Price-Scaling
• Tries to obtain the best price for the strategy
• Algorithms: Strike, Plus, Wealth
Strike Algorithm
• Minimizes the cost of execution relative to a benchmark
• Designed to capture gains in periods of favorable prices
$$\min_{\alpha_t} \; E_t\left[P_{t+1}(\alpha_t) - P_{b,t}\right]^2$$
where $P_{t+1}(\alpha_t)$ is the realized price, $\alpha_t$ the trading aggressiveness, and $P_{b,t}$ the benchmark price.
Size Optimization
• Tries to trade with the position undetected
• Large order packets are broken up for the least amount of market impact (“Stealth Trading”)
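The mechanical part of size optimization, breaking a large parent order into smaller parcels, can be sketched as follows. Equal-sized slicing is an assumption made for illustration; real size-optimization algorithms would choose parcel sizes and timing to minimize market impact. The function name `slice_order` is hypothetical.

```python
def slice_order(total_shares, n_parcels):
    """Split a parent order into near-equal child orders that sum to the total."""
    if total_shares <= 0 or n_parcels <= 0:
        raise ValueError("total_shares and n_parcels must be positive")
    n = min(n_parcels, total_shares)          # never create empty parcels
    base, extra = divmod(total_shares, n)
    # The first `extra` parcels carry one additional share each.
    return [base + 1 if k < extra else base for k in range(n)]


# Break a 10,000-share parent order into six child orders.
child_orders = slice_order(10_000, 6)
```

Because the parcels differ by at most one share, no single child order stands out by size. Randomizing sizes and submission times would be a natural extension if detection is the main concern.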
Executing Real-Time Systems
HFT Business Cycle
Run-time (steps 1 to 4):
1) Receive/archive real-time tick data on securities of interest
2) Apply back-tested econometric models to the tick data obtained in step 1
3) Send orders and keep track of open positions/P&L values
4) Monitor run-time trading behavior, compare with predefined parameters, manage the run-time trading risk
Post-trade (steps 5 to 6):
5) Evaluate trading performance relative to predetermined benchmarks
6) Ensure trading costs incurred during execution are within acceptable ranges
Each function is built with an independent alert system that notifies monitoring personnel of problems, unusual patterns, etc.
Summary
• Introducing Excellerate
• Real-Time BI and HFT
• Information at High Frequency
• Strategies at High Frequency
• Developing and Deploying Models
• Executing and Monitoring Real-Time Systems
Themes
• Metrics
• Data Mining
• Big Data
• Meta Data
• Prediction
Thank You
Sean McClure, Ph.D.
Business Analytics, Excellerate Inc.
smcclure@excellerate4success.com
www.excellerate4success.com
References
1) Richard Hackathorn, "Active Data Warehousing: From Nice to Necessary," Teradata Magazine (June 2006), AR-4835
2) cdn.avangate.com/web/images/articles/fraud-lock.jpg
3) genesissolutions.com/wp-content/uploads/2009/10/3.3.3-MROSupply-307x195.jpg
4) partnerc.com/images/iStock_000007068822Small.jpg
5) http://www.economist.com/node/5475381?story_id=E1_VQSVPRT