7 minute read
Utility Cooperative Forum Comparing two lists or datasets in Microsoft Excel
Editor & Guest Writer Peggy Maranan, Ph.D. DEMCO Director, Finance 16262 Wax Road Greenwell Springs, LA 70739 Phone 225.262.3026 Cell: 239.887.0131 peggym@DEMCO.ORG
Many tasks that accountants and auditors perform require comparing two lists, or datasets, to find matching, non-matching, or missing values. Some of these tasks include bank statement reconciliations, general ledger, and account reconciliations. Auditors also compare lists or datasets for analysis to investigate anomalies or identify trends in data, as well as to provide audit evidence. The ability to analyze entire datasets instead of just samples can improve the quality of the audit. For example, it can aid in duplicate detection of transactional data, identify outliers or anomalies, or detect missing information.
In Excel, like many things, there are multiple ways to accomplish that task. Some of the methods include the use of conditional formatting, creating formulas, and the use of add-in’s such as Power Query. This article will examine some of these various methods and offer guidance on how to select the method that best fits the requirement at hand. It will not address all available methods. The article is also not meant to provide tutorials on these methods. Tutorials can be found on the internet and in various Excel books and publications. This article will describe the methods so that you know what functions are available, and then what to search for when looking for tutorials.
The techniques discussed in this article include: 1. Quick conditional formatting to compare two columns of data 2. Match Data using Row Difference
Technique 3. Row Difference using IF Condition 4. Matching data using the MATCH function 5. Range sizes in data set fluctuate – use
Tables 6. VLOOKUP and XLOOKUP formulas 7. Creating a composite column 8. Using Excel Power Query
1. Quick conditional formatting to compare two columns of data
This method might be the quickest and most simple method. It will allow you to highlight a cell or range of cells based upon defined criteria. A Duplicate Values setting box is available in the Conditional Formatting
drop down list, where you can define the formatting and selection of Duplicate or Unique values. Formatting of values identified can then be defined for both Duplicate or Unique values for datasets in both lists.
2. Match Data using Row Difference Technique
When comparing two lists of data, select both columns of data, press F5 key on the keyboard, select the “Go to special” dialog box. Then select “Row difference” from the options. Matching cells of data across the rows in the columns are in white color and unmatched cells appear in grey color.
3. Row Difference using IF Condition
The IF Condition formula states if there is a match in the row when comparing two lists of data. If there is a match, the result of the formula will be “Matching” and if not then “Not Matching”. The formula would look something like this: =IF(A2=B2, “Matching”, “Not Matching”). The formula would need to be copied down the row of cells.
4. Matching data using the MATCH function
The MATCH function is one of many lookup functions in Excel. A lookup value is defined, and the function finds the location of that value in a list of values, and then identifies matches. It will return a number based upon the position of the value in a lookup array or range of data. Where there is no match, the return value will be an #N/A. Further, the ISNUMBER formula can additionally be used to turn these location numbers into “True” or “False” matches, which can be used where the relative location is not specifically required.
If using Excel 365, there is further capability to include an array as the lookup value of MATCH and the results will populate as a separate list. An array formula is a formula that can perform multiple calculations on one or more items in a row or column of values, or a combination of rows and columns of values. This eliminates the need to copy a formula down the spreadsheet rows or across columns, and instead with just one formula identifying the array, matching values will be identified. This is also achieved through the use of the ISNUMBER formula. The only difference between the MATCH formula and the ISNUMBER formula is that the MATCH formula points to a cell as the lookup value whereas a range of cells is included in the ISNUMBER formula.
Excel 365 also introduced the new function XMATCH. The XMATCH formula replaces the old VLOOKUP function, and has additional power. This formula returns a relative position value as does the MATCH formula. But, with XMATCH, dynamic arrays can also be used to create the formula. Dynamic arrays are resizable arrays that can calculate automatically and return values into multiple cells based on a formula entered in a single cell. The XMATCH formula is very similar to the MATCH formula, except that you do not have to select 0 in the formula for an exact match because that is already built in as a default setting of the XMATCH formula. Finding values “not matching” can be then be determined by using the =NOT(ISNUMBER) formula applied to the same array of data.
5. Range sizes in data set fluctuate – use Tables
For instances where the number of data rows can change in data sets, either rows are added or deleted, these lists are considered dynamic. For dynamic data sets, formulas pointing to specific nondynamic data sets would need to be updated manually. Converting the list of data in the data set to Tables solves this issue. When rows are added or deleted in the table,
formulas including the table reference, will automatically reflect the change.
6. VLOOKUP and XLOOKUP formulas
VLOOKUP formula is used to return a corresponding value from a cell reference. If there is no corresponding value, an #N/A error will be returned. XLOOKUP was introduced in Excel 365. It does the same thing as the VLOOKUP, but you can also define the result if a value is not found.
7. Creating a composite column
When you have several columns of data in two datasets, creating a new column and combining the data into a string of information will create a unique value with which to compare against another set of data. This combined, or composite value, can be created using the CONCATENATE feature. Then, the two columns containing the combined values can be compared for matches.
8. Using Excel Power Query
Power Query is a free add-in for Excel 2010 and 2013 for Windows. It is a built-in feature for Excel 2016 for Windows. Once the add-in is installed, the Power Query tab will appear in the Excel Ribbon.
Power Query can be used to compare two lists to identify which items are duplicates, different, or missing from one list as compared to another.
Data for comparison can be brought into Excel from various sources. Examples include other databases, other data from Excel, the web, an Access database, a CSV file, your accounting system, etc. The steps to transform data from varying sources include: 1. You add your data sources (Excel tables,
CSV files, database tables, webpages, etc.) 2. Press buttons in the Power Query Editor window to transform your data. 3. Output that data to your worksheet or data model (PowerPivot) that is ready for pivot tables or reporting. (Acampora, 2020, para. 3)
Once the data has been transformed into Power Query, there are three join features applied to the tables of data that can help complete these comparisons: a. Inner – will return only those rows that are present in both tables, i.e. the rows that match or duplicate. b. Left Anti – will return rows that are present in the first table but not in the second table. c. Right Anti – will return rows that are present in the second table but not in the first table.
More detailed instructions for completing these comparisons in Power Query can be found at the following myonlinetraninghub website, which is one amongst several websites available on the internet providing instructions: https://www. myonlinetraininghub.com/excel-comparetwo-lists
For more robust applications that have not been identified in this article, there are some software applications available for purchase that can assist with these comparisons. These can be particularly helpful if there are data sets that are large or where there are multiple criteria needing to be applied in data comparison.
References
Acampora, A. (May 13, 2020). Power Query Overview: An Introduction to Excel’s Most Powerful Data Tool. Retrieved from the following website on August 31, 2021: https://www.excelcampus.com/power-tools/ power-query-overview/
Treacy, M. (March 29, 2016). Excel Compare Two Lists. Retrieved from the following website on August 31, 2021: https://www. myonlinetraininghub.com/excel-comparetwo-lists