Identifying Duplicates

You may be familiar with the routine when your database contains duplicate records: you end up having to reference multiple sources, which can be a real pain. Ultimately, you have to decide which record is the master and then merge as much of the data as possible from the duplicates into it.

This is a common scenario in any multi-user database system: anytime more than one person works on the same data, you will inevitably end up with duplicates.

There are a few ways to deal with duplicates. The first is to try to prevent them in the first place. While this is certainly possible, it does not always happen: routine imports may come from external sources, or duplicates may simply have been allowed into the database.

The question now is, "How do you identify duplicates?" And how do you address the fact that our eyes can trick us when two pieces of data appear exactly the same, yet are not? This video shows you how to approach duplicates and how to deal with them by giving you full control over what defines a duplicate.
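To make the "appears the same, yet is not" problem concrete, here is a minimal Python sketch. It is not FileMaker code and is not taken from the video; the sample values and the `reveal` helper are hypothetical. It only illustrates how two values that look identical on screen can differ in their underlying characters.

```python
def reveal(value: str) -> list:
    """Return the Unicode code point of each character, e.g. 'U+0041'."""
    return [f"U+{ord(ch):04X}" for ch in value]


# Hypothetical sample values: the second contains a non-breaking space
# (U+00A0) where the first has an ordinary space (U+0020).
a = "Acme Corp"
b = "Acme\u00a0Corp"

# The two strings render alike but do not compare as equal.
looks_equal = a == b

# Index of the first position at which the two values differ.
first_difference = next(
    i for i, (x, y) in enumerate(zip(a, b)) if x != y
)
```

Comparing `reveal(a)` with `reveal(b)` at `first_difference` exposes the hidden character, which is the kind of difference a visual check misses.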

Attachment: IdentifyingDuplicates.zip (1.69 MB)

Comments

Much needed for journeymen like me

thanks

Hi Matt
I have tried formatting data using an auto-enter Upper calculation to force imported data to upper case; however, the hash looks at the data as entered and returns a different result for upper and lower case text.

We often get sales guys copying data from the web and pasting it into fields, causing records that are effectively duplicates to look unique.

Is there a way for me to take care of this?

Thanks
Andrew

Are you making sure to uncheck the "Do not replace existing" option? Using something like this on an auto-enter field

Lower ( TextFormatRemove ( Self ) )

should generate the same hash signature no matter what.

-- Matt Petrowsky - ISO FileMaker Magazine Editor
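The idea of normalizing a value before hashing it can also be sketched outside FileMaker. The Python below is an illustration only: `signature` is a hypothetical helper, and SHA-256 is an assumption, since the thread does not say which hash is in use. Lowercasing plays roughly the role of `Lower`; plain Python strings carry no text styling, so nothing here corresponds to `TextFormatRemove`.

```python
import hashlib


def signature(text: str) -> str:
    """Hash a value after normalizing its case.

    SHA-256 is an assumption made for this sketch; any stable hash
    would demonstrate the same point.
    """
    normalized = text.lower()  # case differences no longer affect the hash
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def raw_signature(text: str) -> str:
    """Hash a value exactly as entered, for comparison."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

With these helpers, `signature("ACME Corp")` and `signature("acme corp")` are equal, while `raw_signature` gives different results for the same two inputs.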

Matt,
This solved the problem when creating records, but I still have to find a way to clean imported records.
We have customers that have the same Company name and ZIP code but whose Address first line changes. I use the Name, Address First Line and ZIP to check for duplicates, so there are plenty of chances for spurious characters!
Many thanks for the help
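One possible approach to cleaning records that have already been imported is to build a comparison key from the cleaned Name, Address First Line and ZIP values and group records on that key. The Python sketch below is an assumption-laden illustration, not FileMaker code: the field names (`name`, `address1`, `zip`), the helper functions, and the sample records are all hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def clean(value: str) -> str:
    """Normalize one field so that spurious characters do not matter."""
    # Unify compatibility forms, e.g. non-breaking space becomes a space.
    value = unicodedata.normalize("NFKC", value)
    # Drop invisible format characters such as the zero-width space.
    value = "".join(ch for ch in value if unicodedata.category(ch) != "Cf")
    # Collapse runs of whitespace and trim both ends.
    value = re.sub(r"\s+", " ", value).strip()
    # Ignore case.
    return value.casefold()


def duplicate_key(record: dict) -> tuple:
    """Build the comparison key from the three fields named in the thread."""
    return (clean(record["name"]), clean(record["address1"]), clean(record["zip"]))


def find_duplicates(records: list) -> list:
    """Return groups of records that share the same cleaned key."""
    groups = defaultdict(list)
    for record in records:
        groups[duplicate_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical imported data: the second record hides a non-breaking space,
# a trailing space, a doubled space and a zero-width space.
records = [
    {"name": "Acme Corp", "address1": "1 Main St", "zip": "90210"},
    {"name": "ACME\u00a0Corp ", "address1": "1  Main St\u200b", "zip": "90210"},
    {"name": "Acme Corp", "address1": "22 Oak Ave", "zip": "90210"},
]
duplicates = find_duplicates(records)
```

Here the first two records fall into the same group, while the third stays separate because its address line genuinely differs. Which fields belong in the key, and how aggressively to clean them, remains a decision for whoever defines what counts as a duplicate.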