Loan document packages often exceed 500 pages and can include more than 100 unique document types, so the mortgage loan origination lifecycle depends heavily on accurate and efficient document classification. Loan documents come from multiple sources, including brokers, lenders, borrowers, employers, and online vendors, and the set of required documents varies from state to state and county to county. Inconsistent and potentially confusing nomenclature can make it difficult to categorize documents by hand. For example, “1003”, “mortgage application form”, “uniform residential loan application”, and “URLA” all refer to the same document.
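To make the naming problem concrete, the following is a minimal sketch of how several names could be mapped to one canonical label. It is an illustration only, not a description of any particular classification system. The alias table, the choice of "URLA" as the canonical label, and the function names `_normalize` and `canonical_type` are all assumptions made for this example. Only the four aliases themselves come from the text above.

```python
import re
from typing import Optional

# Hypothetical alias table: canonical label -> names seen in practice.
# The four aliases are taken from the text; using "URLA" as the
# canonical label is an arbitrary choice for this sketch.
ALIASES = {
    "URLA": [
        "1003",
        "mortgage application form",
        "uniform residential loan application",
        "URLA",
    ],
}


def _normalize(name: str) -> str:
    """Lowercase the name and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Reverse index: normalized alias -> canonical label.
_LOOKUP = {
    _normalize(alias): canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def canonical_type(name: str) -> Optional[str]:
    """Return the canonical label for a document name, or None if unknown."""
    return _LOOKUP.get(_normalize(name))


if __name__ == "__main__":
    for raw in ["1003", "Uniform Residential Loan Application", "appraisal report"]:
        print(repr(raw), "->", canonical_type(raw))
```

A lookup like this only handles names that are already known and spelled close to a listed alias. With more than 100 document types arriving from many sources, a hand-maintained table would likely be incomplete, which is part of why manual categorization is hard.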