ImportDiff Utility


Introduction

This utility compares the contents of two files, file1 and file2, and creates an output file containing only those entries from file2 that do not have an exact match in file1. If you regularly use the Unanet imports to synchronize your Unanet data with an external system, and cannot export only new transactions from that system, you may be feeding Unanet a complete reload of data with each run. The utility can reduce your import volume by comparing yesterday's load file with today's load file and generating a load file containing only the new or changed records.

The ImportDiff utility is located in the unanet/utilities directory and is named ImportDiff.jar. To run the utility, make the directory containing ImportDiff.jar your current directory and type:

   java -jar ImportDiff.jar

Run without arguments, as shown here, the utility displays its usage message.


Requirements

You must have Java 1.8 or higher to use the ImportDiff utility.


Syntax and Options

usage:  java  -jar ImportDiff.jar  file1 file2 outFile

file1     The baseline file against which the second file is compared.

file2     The file whose records are checked. Each record in file2 is compared against file1 to see whether an exact match exists.

outFile   The output file. Every record from file2 that has no exact match in file1 is written to this file.
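
The comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the utility's actual implementation: the function names, the UTF-8 encoding, and the treatment of each line as one record are assumptions made for the example.

```python
def read_lines(path):
    """Return the lines of a text file without their trailing newlines."""
    # Assumption: one record per line, UTF-8 encoded.
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def import_diff(file1, file2, out_file):
    """Write every line of file2 that has no exact match in file1 to out_file.

    Returns (file1 line count, file2 line count, output line count).
    """
    lines1 = read_lines(file1)
    index = set(lines1)  # index of file1 for fast exact-match lookups
    lines2 = read_lines(file2)

    # Keep file2 order; a line survives only if it is absent from file1.
    unmatched = [line for line in lines2 if line not in index]

    with open(out_file, "w", encoding="utf-8") as out:
        for line in unmatched:
            out.write(line + "\n")

    return len(lines1), len(lines2), len(unmatched)
```

Note that a record whose content changed in any way counts as unmatched, so both new and modified records end up in the output file, while records that were removed from file2 do not appear at all.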


Reading The Output

The ImportDiff utility writes a set of record counts to standard output. If you run the utility unattended, you may wish to capture this output.

A sample session (using 100,000 line input files) might look like this:

C:\unanet\utilities> java -jar ImportDiff.jar C:\tmp\file1.csv C:\tmp\file2.csv C:\tmp\import.csv

     File 1 line count: 100000

     File 2 line count: 100000

Output File line count: 100

The ImportDiff utility exits with a status of 0 if it runs successfully, and with a non-zero status if it fails for a detectable reason. On failure, it also writes the reason to standard error.
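
For unattended runs, a small wrapper script can capture the counts and check the exit status. The following Python sketch is one possible approach, not part of the Unanet distribution: the function names are hypothetical, and the pattern assumes the count lines look exactly like the sample session above.

```python
import re
import subprocess

# Matches lines such as "     File 1 line count: 100000".
# Assumption: the labels match the sample session shown above.
COUNT_PATTERN = re.compile(
    r"^[ \t]*(File 1|File 2|Output File) line count:[ \t]*(\d+)[ \t]*$",
    re.MULTILINE,
)


def parse_counts(stdout_text):
    """Extract the record counts from the utility's standard output."""
    return {label: int(value) for label, value in COUNT_PATTERN.findall(stdout_text)}


def run_and_check(command):
    """Run the command, raise on a non-zero exit status, return parsed counts."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # The failure reason is expected on standard error.
        raise RuntimeError(
            f"ImportDiff failed (exit {result.returncode}): {result.stderr.strip()}"
        )
    return parse_counts(result.stdout)


# Example call (paths taken from the sample session):
# counts = run_and_check(["java", "-jar", "ImportDiff.jar",
#                         r"C:\tmp\file1.csv", r"C:\tmp\file2.csv", r"C:\tmp\import.csv"])
# counts["Output File"] would then hold the number of records to import.
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.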


Memory Requirements

The ImportDiff utility compares the two files by completely indexing the contents of the first file and then checking each line of the second file against that index. While this index may grow large, the default JVM heap size should be sufficient for most file comparisons. If you are comparing very large files with millions of import records, you may encounter a Java OutOfMemoryError. You can overcome this by allowing the JVM to use a larger heap while running the comparison.

You should consult the documentation for your JVM for the specific method required to allocate more heap space. The following provides an example using the Java(TM) 2 Runtime Environment, Standard Edition available from Sun:

C:\unanet\utilities> java -Xmx256M -jar ImportDiff.jar C:\tmp\file1.csv C:\tmp\file2.csv C:\tmp\import.csv

     File 1 line count: 2450000

     File 2 line count: 2450000

Output File line count: 1276
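
In a scripted run, the heap option can be added to the command line only when it is needed. The helper below is a hypothetical sketch: build_command and its parameters are names invented for this example, and it assumes a JVM that accepts the -Xmx option shown in the sample above. Other JVMs may use a different option.

```python
def build_command(file1, file2, out_file, max_heap=None, jar="ImportDiff.jar"):
    """Build the ImportDiff command line as an argument list.

    max_heap is an optional heap size such as "256M"; when omitted,
    the JVM's default heap size applies.
    """
    command = ["java"]
    if max_heap:
        # JVM options must come before -jar.
        command.append(f"-Xmx{max_heap}")
    command += ["-jar", jar, file1, file2, out_file]
    return command


# Example: the command from the sample above, as an argument list.
# build_command(r"C:\tmp\file1.csv", r"C:\tmp\file2.csv", r"C:\tmp\import.csv",
#               max_heap="256M")
```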


Running The Diff Utility from Another Machine

You must have a copy of the ImportDiff.jar file on the machine from which you will run the utility program (or have access to that file from the machine you are running from).  This file is delivered with the Unanet software and resides in the "unanet/utilities" directory.

Also, you must have Java 1.8 or higher installed on the machine.

Note: You must use the copy of the ImportDiff.jar file packaged with each release.  When migrating from one major release to the next major release, you'll need to make sure you copy the new ImportDiff.jar file to your remote machine.
