Review & Presentation
Radio Frequency Identification & Data Management
Temporal Management of RFID Data by Fusheng Wang, Peiya Liu
Warehousing and Analyzing Massive RFID Data Sets by Hector Gonzalez, Jiawei Han, et al.
Cost-Conscious Cleaning of Massive RFID Data Sets by Hector Gonzalez, Jiawei Han, et al.
COP 6731 – For Dr. Kien A. Hua Presented by Rahat
[Figure: parts of an RFID tag, labeled Antenna, IC Chip, Substrate, Connection]
Outline
RFID & Discussion on Papers
“The so-called nucleus of enterprise supply chain management”
Research Problem
Characteristic aspects of RFID data management and how they help researchers address its diverse challenges
Introductory Words: “An Important Technology”
Tracking Technologies and Automatic ID Systems
Various technologies are used to track and automatically ID people, products, and other objects
– Barcodes
– Optical Character Recognition (OCR)
– Biometrics
• Voice recognition and ID systems
• Fingerprint ID systems
– Smart cards
– Memory cards
– Microprocessor cards
RFID: What Is It?
RFID combines many of the features of several of these technologies
– Like barcodes, RFID is used to identify and track objects
– As with OCR and biometrics, RFID enables automatic ID and verification
– RFID can also be used like smart cards, memory cards, and microprocessor cards to store information and provide interactive data processing
Most RFID tags contain at least two parts. One is an integrated circuit for storing and processing information, modulating and demodulating a radio-frequency (RF) signal, and performing other specialized functions. The second is an antenna for receiving and transmitting the signal.
How Is RFID Unique?
– It can be used to accurately locate and identify objects from a distance using RF signals
– It can be used to detect and read objects that are not in line of sight
– Data can be interactively managed and processed by the RFID chip and RFID system
Market & Application
Industrial Products
Logistics/ Trans.
Consumer Products
Retail Products
Homeland Security
Key Industry Drivers Leading Us Toward RFID
Other Service
Example: RFID in Libraries
Simplifies checkout process for staff
Inventory of Collections
Use with new and future technology
Item Security
Express checkout for patrons
Benefits of RFID
• Automatic reads
• Active chips can be written
• Many chips can be read simultaneously
• Standardized and unique encoding
• Better process-specific data collection
Innovative Applications
• Supply chain tracking
• Retail and inventory management
• Baggage handling
• Credit cards
• Health care ID and medical data
• Smart passports
• Import/export processes
• Auto ID for tolls, ignition, parking
• Child and pet tracking
Current Technology
Three Main Papers
RFID & Discussion
& the Supporting Cast
Managing RFID Data (VLDB 2004)
Sudarshan S. Chawathe, Venkat Krishnamurthy et al.
• Not the Supporting Paper
• The authors present a brief introduction to RFID technology and highlight a few of the data management challenges
• A layered architecture
– RFID tags
– Tag readers
– Savant/Middleware
• Maps the low-level data stream from readers to a more manageable form that is suitable for application-level interactions
– EPC-IS
• Combines business logic with the stream of data emerging from the sensing framework below it
– ONS
• Essentially a global lookup service
Characteristics of RFID Data
Three main problems came out, each with a probable solution

Large volume
– A retailer with 3,000 stores sells 10,000 items a day per store
– Each item leaves about 10 traces of the form (EPC, location, time) before leaving the store
– How many tuples does this generate each day? 10,000 × 10 × 3,000 = 300,000,000 (without redundancy)
– Walmart is expected to generate 7 terabytes of RFID data per day
Probable solution: Model and storage of RFID data

Implicit semantics
– Observations imply location changes, aggregations, and business processes
Probable solution: Query and data mining of RFID data

Inaccurate data
– Noisy data and duplicate readings
Probable solution: Data cleaning of RFID data
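The daily tuple count on the large-volume slide can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch; the function name `daily_tuples` is made up for illustration, and the inputs are the figures quoted on the slide.

```python
def daily_tuples(stores, items_per_store_per_day, traces_per_item):
    """Number of (EPC, location, time) tuples generated per day, ignoring redundancy."""
    return stores * items_per_store_per_day * traces_per_item


# Figures from the slide: 3,000 stores, 10,000 items per store per day, 10 traces per item.
tuples_per_day = daily_tuples(3_000, 10_000, 10)
print(f"{tuples_per_day:,} tuples per day")
```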
Large Volume
Papers on Problem One
• Mention one paper on the Bitmap Datatype
• Cover one of the three main papers: Temporal Management
Supporting RFID-based Item Tracking Applications in Oracle DBMS Using a Bitmap Datatype (VLDB 2005)
Ying Hu, Seema Sundara et al.
Observation
– Groups of items in the same proximity, e.g. on a shelf or in a shipment
– Groups of items with the same property, e.g. the same product
Main Idea
Instead of storing one tuple per item, store one tuple for all the items that share the same prefix.
EPC BITMAP SEGMENT DATATYPE
– A new type to represent a collection of EPCs with a common prefix
It performs very well in the bulk load performance and storage comparisons.
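The prefix idea can be illustrated with a minimal Python sketch. It is not Oracle's datatype or API: the 8-bit suffix width, the integer EPCs, and the function names are all assumptions chosen to keep the example small.

```python
from collections import defaultdict

SUFFIX_BITS = 8  # assumption: only the last 8 bits of an EPC vary within a group


def to_bitmap_segments(epcs):
    """Collapse integer EPCs into {prefix: bitmap}, one entry per common prefix."""
    segments = defaultdict(int)
    for epc in epcs:
        prefix = epc >> SUFFIX_BITS
        suffix = epc & ((1 << SUFFIX_BITS) - 1)
        segments[prefix] |= 1 << suffix  # set the bit that stands for this item
    return dict(segments)


def from_bitmap_segments(segments):
    """Expand the segments back into the sorted list of EPCs."""
    epcs = []
    for prefix, bitmap in segments.items():
        suffix = 0
        while bitmap:
            if bitmap & 1:
                epcs.append((prefix << SUFFIX_BITS) | suffix)
            bitmap >>= 1
            suffix += 1
    return sorted(epcs)


epcs = [0x1200, 0x1201, 0x12FF, 0x3405]
segments = to_bitmap_segments(epcs)  # two rows instead of four
```

Four item tuples become two segment tuples here, and the original EPC set can be recovered from the segments, which is the storage saving the slide refers to.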
Featured Paper 1 of 3
Temporal Management of RFID Data (VLDB 2005)
Fusheng Wang, Peiya Liu
Authors
Proposed by Wang and Liu from Siemens
Senior Member of Technical Staff
Siemens Corporate Research, Inc., Multimedia Documentation Program
Princeton, New Jersey, USA
Email: pliu@scr.siemens.com
Observation
• RFID entities are static and are not altered, while RFID relationships are dynamic and change all the time
• RFID data are time-dependent and arrive in large volumes
• RFID data management systems need to effectively support such large-scale temporal data created by RFID applications. These systems need an explicit temporal data model for RFID data to support tracking and monitoring queries.
• In addition, they need an automatic method to transform the primitive observations from RFID readers into derived data used in RFID-enabled applications.
In this paper, the authors present an integrated RFID data management system
Dynamic Relationship ER Model
Two types of dynamic relationships are added:
• Event-based dynamic relationship: a timestamp attribute is added to represent the time at which the event occurs.
• State-based dynamic relationship: tstart and tend attributes are added to represent the lifespan of a state.
Static entity tables
– OBJECT (object_epc, name, description)
– LOCATION (location_id, name, owner)
– SENSOR (sensor_epc, name, description)
– TRANSACTION (transaction_id, transaction_type)
Dynamic relationship tables
– CONTAINMENT(epc, parent_epc, tstart, tend)
– OBSERVATION(sensor_epc, value, timestamp)
– OBJECTLOCATION(epc, location_id, tstart, tend)
– SENSORLOCATION(sensor_epc, location_id, position, tstart, tend)
– TRANSACTIONITEM(transaction_id, epc, timestamp)
The paper contains a number of nice but complex examples, and also covers:
1. Siemens RFID Middleware Architecture
2. Rules-based RFID Data Transformation
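A minimal sketch of how a state-based table such as OBJECTLOCATION can answer tracking queries, using Python's built-in `sqlite3`. The column types, the use of NULL in `tend` for a still-current state, and the `observe` rule are assumptions made for illustration; they are not the paper's rule language or the Siemens middleware.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OBJECT (object_epc TEXT PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE LOCATION (location_id TEXT PRIMARY KEY, name TEXT, owner TEXT);
-- State-based dynamic relationship: [tstart, tend) is the lifespan of a state.
-- Assumption: tend IS NULL marks a state that is still current.
CREATE TABLE OBJECTLOCATION (epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER);
""")


def observe(epc, location_id, t):
    """Hypothetical rule: a reading at a new location closes the open state and opens a new one."""
    row = conn.execute(
        "SELECT location_id FROM OBJECTLOCATION WHERE epc = ? AND tend IS NULL",
        (epc,)).fetchone()
    if row and row[0] == location_id:
        return  # same location as before, so the current state simply continues
    conn.execute(
        "UPDATE OBJECTLOCATION SET tend = ? WHERE epc = ? AND tend IS NULL", (t, epc))
    conn.execute(
        "INSERT INTO OBJECTLOCATION VALUES (?, ?, ?, NULL)", (epc, location_id, t))


def location_at(epc, t):
    """Tracking query: where was the object at time t?"""
    row = conn.execute(
        "SELECT location_id FROM OBJECTLOCATION "
        "WHERE epc = ? AND tstart <= ? AND (tend IS NULL OR ? < tend)",
        (epc, t, t)).fetchone()
    return row[0] if row else None


# Four primitive readings become three location states.
for location_id, t in [("warehouse", 10), ("warehouse", 15), ("truck", 20), ("store", 35)]:
    observe("epc1", location_id, t)
```

The repeated warehouse reading does not create a new row, which is the kind of transformation from primitive observations to derived data that the slides describe.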
Advantages
• Provides powerful query support for RFID object tracking and monitoring
• Can adapt to different RFID-enabled applications
• Enables semantic RFID data filtering
• Automatic data transformation based on declarative rules
• Is a powerful and realistic model of RFID, as it integrates business processes into the data model itself
Implicit Semantics
Papers on Problem Two
• Mention the introduction and keynote on Warehousing & Mining
• Cover one of the three main papers, on Warehousing
Warehousing 101
• Huge data sets, terabytes generated each day
• We need OLAP to make sense of the data
• Traditional data cubes don’t work. A data cube only provides aggregates for a given combination of dimension values. We need aggregates at the path level.
The architecture of the RFID Warehouse is based on these key ideas of RFID data compression:
• Taking advantage of data generalization
• Taking advantage of bulky object movements
• Taking advantage of the merge and/or collapse of path segments
Warehousing 101
• Lossless compression
– Remove redundancy: (r1,l1,t1) (r1,l1,t2) ... (r1,l1,t10) => (r1,l1,t1,t10)
– Group objects that move and stay together
• Data cleaning: multi-reading, missed-reading, error-reading, bulky movement
• Data mining: find trends, outliers, and frequent, sequential, and flow patterns
• Multi-dimensional summary: product, location, time, …
• Query processing
– Support for OLAP: roll-up, drill-down, slice, and dice
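The two lossless-compression steps (removing redundant readings and grouping objects that move together) can be sketched in Python as follows. The function names and the tuple layout are assumptions for illustration, not the warehouse design from the papers.

```python
from collections import defaultdict


def collapse_readings(readings):
    """Collapse consecutive (epc, location, time) readings into (epc, location, t_in, t_out) stays."""
    stays = []
    for epc, loc, t in sorted(readings, key=lambda r: (r[0], r[2])):
        if stays and stays[-1][0] == epc and stays[-1][1] == loc:
            stays[-1][3] = t  # same object, same place: extend the current stay
        else:
            stays.append([epc, loc, t, t])
    return [tuple(s) for s in stays]


def group_by_path(stays):
    """Group objects whose sequences of stays are identical (they moved and stayed together)."""
    paths = defaultdict(list)
    for epc, loc, t_in, t_out in stays:
        paths[epc].append((loc, t_in, t_out))
    groups = defaultdict(list)
    for epc, path in paths.items():
        groups[tuple(path)].append(epc)
    return dict(groups)


# r1 and r2 are read at l1 at times 1..10; r3 is read at l2 at times 1..3.
readings = [(e, "l1", t) for e in ("r1", "r2") for t in range(1, 11)]
readings += [("r3", "l2", t) for t in range(1, 4)]
stays = collapse_readings(readings)
groups = group_by_path(stays)
```

Here 23 raw readings shrink to three stay records, and the two objects that shared the same stay end up in one group.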
The Big Picture
Warehousing and Mining Massive RFID Data Sets
Keynote for ADMA 2006 - Jiawei Han
Featured Paper 2 of 3
Warehousing and Analyzing Massive RFID Data Sets (ICDE 2006)
Hector Gonzalez, Jiawei Han, Xiaolei Li & Diego Klabjan
Authors
Motivation
(1) Items usually move together in large groups through the early stages of the system (e.g., distribution centers), and only in later stages (e.g., stores) do they move in smaller groups
(2) Although RFID data is registered at the primitive level, data analysis usually takes place at a higher abstraction level
As a departure from the traditional data cube, the authors propose a new warehousing model
• It preserves object transitions while providing significant compression and path-dependent aggregates
• Techniques for summarizing and indexing data, and methods for processing a variety of queries based on this framework, are developed in this study
Advantages
• Allows high-level analysis to be performed efficiently and flexibly in multidimensional space
• The model is composed of a hierarchy of highly compact summaries (RFID-Cuboids) of the RFID data, aggregated at different abstraction levels
• Efficient answering of a wide range of RFID queries
• Collapses multiple movements into a single record without loss of information
Warehousing 101
FlowGraphs
• Tree-shaped workflow
• Captures main trends and significant deviations
FlowGraph Cubing
• Data cube where each cell is a FlowGraph
• The FlowCube goes beyond the traditional data cube with scalar aggregates, and adds a path view of the data
Why FlowGraphs
• Compact summary of popular paths traversed by items
• Highlights important deviations from popular paths
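One way to picture a tree-shaped workflow is a prefix tree over item paths with a count on every node. The sketch below is a simplified illustration under that assumption; the dictionary layout and function names are made up here, and the FlowGraph in the papers carries more information than plain counts.

```python
def build_flowgraph(paths):
    """Build a prefix tree of location paths; each node counts the items that passed through it."""
    root = {"count": 0, "children": {}}
    for path in paths:
        node = root
        node["count"] += 1
        for loc in path:
            node = node["children"].setdefault(loc, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def transition_probabilities(node):
    """Share of items going to each child, among the items that continue past this node."""
    total = sum(child["count"] for child in node["children"].values())
    if total == 0:
        return {}
    return {loc: child["count"] / total for loc, child in node["children"].items()}


# Three items follow the popular path, one deviates at the distribution center.
paths = [("factory", "dc", "store")] * 3 + [("factory", "dc", "return")]
graph = build_flowgraph(paths)
dc_node = graph["children"]["factory"]["children"]["dc"]
probs = transition_probabilities(dc_node)
```

The popular path shows up as the high-probability branch and the returned item as a low-probability deviation, which is the kind of summary the slide attributes to FlowGraphs.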
Other Work by DAIS, University of Illinois
• FlowCube: Constructing RFID FlowCubes for Multi-Dimensional Analysis of Commodity Flows, VLDB 2006
• Mining Compressed Commodity Workflows From Massive RFID Data Sets, CIKM’06
• Cost-Conscious Cleaning of Massive RFID Data Sets, ICDE’07
Note that their RFID model, and the methods for warehouse construction and query analysis built on it, rest on the assumption that RFID-tagged items tend to move together in bulk, especially at the early stages. This fits a good number of RFID applications but not all, so further study is possible.
Inaccurate Data
Papers on Problem Three
• Mention three papers on data cleaning
• Cover one of the three main papers, on Cost-Conscious Cleaning
Issues in Data Cleaning
False negative readings
• RFID tags might not be read at all while present in a reader's range
• Caused by
– RFID readers capture only 60-70% of all tags that are in the vicinity
– RF collisions
– Water or metal shielding
False positive readings
• Besides the RFID tags that should be read, additional unexpected readings are generated
• Caused by
– RFID tags outside the normal reading scope of a reader are captured by the reader
– An RFID tag has moved out of the reader's vicinity, but the reader fails to register its departure
Duplicate readings
• Caused by
– Tags in the scope of a reader for a long time are read by the reader multiple times
– Multiple readers are installed to cover a larger area or distance, and tags in the overlapping areas are read by multiple readers
– To enhance reading accuracy, multiple tags with the same EPC are attached to the same object, thus generating duplicate readings
Logical anomalies: tend to be application dependent
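A fixed-window smoothing filter is one simple way to deal with missed readings, and it also shows why cleaning is a trade-off. The sketch below is an assumption-laden illustration: time is divided into integer epochs, and a tag counts as present if it was read in any of the last `window` epochs. It is not the method of any of the papers covered here.

```python
def smooth(read_epochs, horizon, window=3):
    """Return the epochs in 0..horizon-1 at which the tag counts as present.

    A tag counts as present at epoch t if it was read in at least one of the
    epochs t, t-1, ..., t-window+1.
    """
    reads = set(read_epochs)
    return [t for t in range(horizon)
            if any((t - k) in reads for k in range(window))]


# The tag was really present during epochs 0..5, but the reader missed epochs 2 and 3.
present = smooth([0, 1, 4, 5], horizon=10, window=3)
```

With these inputs the filter fills the gap at epochs 2 and 3 (the false negatives), but it also reports the tag at epochs 6 and 7 after it has left, which is the second cause of false positives described above. A larger window repairs more missed reads at the price of more such stale readings.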
Data Cleaning Papers
• A Pipelined Framework for Online Cleaning of Sensor Data Streams, ICDE 2006 (short paper)
• Adaptive Cleaning for RFID Data Streams, VLDB 2006. Shawn R. Jeffery, Minos Garofalakis, Michael J. Franklin
• Efficiently Filtering RFID Data Streams, VLDB CleanDB 2006. Yijian Bai, Fusheng Wang, Peiya Liu
Existing cleaning techniques have focused on accuracy, but have disregarded the very high cost of cleaning in a real application
Featured Paper 3 of 3
Cost-Conscious Cleaning of Massive RFID Data Sets, ICDE’07
Hector Gonzalez, Jiawei Han & Xiaolei Li
Authors
Observation & Motivation
• False negative readings
• False positive readings
• Duplicate readings
• Logical anomalies: tend to be application dependent
• Existing cleaning techniques have disregarded the very high cost
Contribution
• Propose a cleaning framework
• Identify the conditions under which a specific cleaning method, or a sequence of cleaning methods, can be applied in order to minimize the expected cleaning costs, including error costs
Cleaning Plan
• Define a cost model
• Assign costs to cleaning methods (training cost, execution cost, maintenance cost, etc.)
• Define error costs
• Using training data, determine the efficacy of the cleaning methods under different contexts
• Optimization problem: construct a cleaning plan (which methods to apply under which circumstances) that minimizes the total expected cleaning costs
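The optimization idea can be sketched as choosing, for each context, the method with the lowest expected cost, where expected cost is the execution cost plus the error rate times the error cost. This is a deliberate simplification and not the paper's algorithm; the method names, contexts, and all numbers below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    cost: float        # per-tuple execution cost (made-up units)
    error_rate: dict   # context -> estimated probability of a wrong result


def expected_cost(method, context, error_cost):
    """Execution cost plus the expected cost of the errors the method leaves behind."""
    return method.cost + method.error_rate[context] * error_cost


def cleaning_plan(methods, contexts, error_cost):
    """Pick, for every context, the method with the lowest expected cost."""
    return {c: min(methods, key=lambda m: expected_cost(m, c, error_cost)).name
            for c in contexts}


# Hypothetical methods with error rates that would come from training data.
methods = [
    Method("smoothing", cost=1.0, error_rate={"quiet": 0.02, "noisy": 0.30}),
    Method("model_based", cost=5.0, error_rate={"quiet": 0.01, "noisy": 0.05}),
]
plan = cleaning_plan(methods, ["quiet", "noisy"], error_cost=20.0)
```

With these numbers the cheap method wins where readings are reliable, and the expensive one is only worth its cost in the noisy context, which is the point of making cleaning cost-conscious.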
Advantages
• Development of accurate methods
• Other existing methods work well under a wide set of conditions, but they disregard the cost
• Induces a cleaning plan that optimizes the overall accuracy-adjusted cleaning costs
• Simple architecture of the cleaning framework
• Cost-conscious framework that learns when and how to apply different cleaning techniques in order to optimize total cleaning costs and accuracy
Some Thoughts & References
RFID & Discussion
Concluding Remarks
• A number of the papers propose general and expressive temporal-oriented data models for RFID data
• The data models are shown to be quite powerful in supporting RFID data tracking and monitoring
• The rules-based framework enables automatic RFID data filtering, transformation, and aggregation to generate semantic high-level data
• The system can be adapted to different RFID applications, and thus substantially reduces the cost of managing and integrating RFID data into business applications
Society’s Concerns
Privacy
- Tracking individuals - Illicit or inappropriate use of personal data - Tracking personal activities (e.g., purchase habits, travel)
Security
- Unsanctioned readers - Theft of information - Inadequate encryption
Global differences
- Regulations around collecting data - Standards - Ownership of data
Future Work
• Privacy and security for the deployment of RFID. Tampering is a big research topic
• Secure management of RFID data
• XML-based traceability of RFID data
One notable paper, somewhat different from the others, was “Integrating Automatic Data Acquisition with Business Processes: Experiences with SAP’s Auto-ID Infrastructure”, presented at one of the VLDB conferences by Christof Bornhovd, Tao Lin, Stephan Haller, and Joachim Schaper
Auto-ID Infrastructure: Open Issues
• Different qualities of service
• Distributed smart items infrastructure
• Seamless integration of environmental sensors
In the next stage we expect to look into some of these issues
Reference Books
• RFID Security. Frank Thornton ... [et al.]. Rockland, MA: Syngress, c2006.
• RFID Implementation. Dennis E. Brown. New York: McGraw-Hill, c2007.
• RFID Essentials. Bill Glover and Himanshu Bhatt. Beijing; Sebastopol, CA: O'Reilly, 2006.
• RFID for Dummies. Patrick J. Sweeney. Hoboken, N.J.; Chichester: Wiley, 2005.
Websites
Managing RFID data
http://portal.acm.org/citation.cfm?id=1316791
http://portal.acm.org/citation.cfm?id=1083592.1083723
http://www.informaworld.com/index/768428270.pdf
http://portal.acm.org/citation.cfm?id=1107548.1107603
http://www.ingentaconnect.com/content/mcb/089/2003/00000031/00000010/art00005
Temporal management of RFID data
http://portal.acm.org/citation.cfm?id=1083592.1083723
http://www.springerlink.com/index/f32p7n6t32q71703.pdf
http://portal.acm.org/citation.cfm?id=1164127.1164143
http://millennium.cs.ucla.edu/~zaniolo/papers/ICDE07RFID.pdf
http://www.itee.uq.edu.au/~xueli/RoozbehDerakhshan.pdf
Mining compressed commodity workflows from massive RFID data sets
http://portal.acm.org/citation.cfm?id=1183641
http://daisy.cs.uiuc.edu/hector/research.pdf
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4368141
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4401085
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4529801
Warehousing and Analyzing Massive RFID Data Sets
http://xavier.ceng.calpoly.edu/ime312/RFID_warehouse.pdf
http://portal.acm.org/citation.cfm?id=1083592.1083723
http://doi.ieeecomputersociety.org/10.1109/ICDE.2006.171
http://www.springerlink.com/index/r7y7wnmwqtfpx92t.pdf
http://portal.acm.org/citation.cfm?id=1080148.1080164
Backup Slide
RFID Tags: Passive vs. Active
RFID in Action
A Very Active Tag
Backup Slide
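[Figure: state-transition diagram for the guard channel policy, with states 0, 1, 2, …, T, T+1, T+2, …, C-1, C; arrival rates labeled λ and αλ; service rates labeled μ, 2μ, …, Tμ, (T+1)μ, (T+2)μ, …, (C-1)μ, Cμ]
The guard channel policy rejects all new calls until the channel occupancy goes below the threshold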
Some Other Thoughts
The sky is still the limit, but RFID has not yet reached half of its potential
Some Links (if time permits)
RFID - Technology Video (Detailed)
http://www.youtube.com/watch?v=4Zj7txoDxbE
RFID Demonstration
http://www.youtube.com/watch?v=FVmD4iTXRLE
That's how we will shop within a couple of years!
http://www.youtube.com/watch?v=sDyqhcy1L-0
Future Store (Smart Check Out)
http://www.youtube.com/watch?v=zBz3aoikLpU