View Full Version : "pruning" a GedCOM


drlott
24 August 2006, 12:42 PM
I have received a GEDCOM that contains families I do not wish to import. I wish to be selective and "prune" out the data that I do not want imported into my family file.
The GEDCOM is sitting inside my Reunion application folder and I have a "working" family file ready to receive it. The GEDCOM is a list of numbers followed by a list of titles and then info: either a name, a date, or a notation (570 pages).
I have been told there are 3 files in the GEDCOM and 2 are backups... how do I visually spot these backups, and how do I find the "Crow-Foster" file in all of this?
Do I import it into a family file and then delete, or what?

I have a second GEDCOM of another family that may have some duplicates of info I have. I have a "second working file" ready to receive this. Do I go back and forth between the family file and the working file to compare, or is there a way to print out both the GEDCOM and the family file and do a visual comparison, or ?

David G. Kanter
24 August 2006, 04:34 PM
I have received a GEDCOM on which are families I do not wish to import. I wish to be selective and "prune" out data that will not be imported to my family file. The GEDCOM is sitting inside my Reunion application folder and I have a "working" family file ready to receive it.

For what it's worth, any time I'm working with data from another person--whether via GEDCOM or a Family File--I make a clone of the Family File which ultimately will be receiving that data. (File->Save a Copy with the Type set to "Clone (no records)") That ensures all my customizations are in place before the Import operation--which is particularly useful if I'll be importing a GEDCOM so, before I click the "Import" button, I can adjust the Optional Fields to better parse the data into my fields. Once I've "cleaned up" the imported data to match all my own data-entry protocols, then I'd do the Import of that Family File into a copy of my "master" Family File--having made sure I have a proven-to-be-good backup of my "master" Family File, too.

The GEDCOM is a list of numbers followed by a list of titles and then info of either a name date or notation. (570 pages) I have been told there are 3 files in the GEDCOM and 2 are backups...how do I visually spot these backups

Each valid GEDCOM begins with the line "0 HEAD" and ends with the line "0 TRLR". You can open the file you got with any word processor and see whether the person did put multiple GEDCOMs into a single file. Not "nice" to have done that, in my opinion; each GEDCOM should have been sent as a separate file. If they did, you can select each GEDCOM and do a copy-and-paste into a new document--remembering that when you save a GEDCOM it must be saved as a plain-text document. (That choice should be available to you as a "format" or "type" decision as you are doing the save.)

Now unless the person who prepared that file also included something between the GEDCOMs (or perhaps in a "1 FILE" line within the GEDCOM) to identify which is the "real" one and, if different in content, which are the "backups", you may have a problem deciding which is which. If in doubt, I'd probably import each of the GEDCOMs into its own cloned Family File and see whether you can find some different information in one that you've been told was newly added.
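
For anyone who would rather not copy and paste by hand, the "0 HEAD" / "0 TRLR" boundaries described above can be sketched in a few lines of Python. This is only an illustration, not a Reunion feature: the function name split_gedcoms and the sample text are made up for the example, and it assumes the combined file is plain text with each "0 HEAD" and "0 TRLR" on its own line.

```python
def split_gedcoms(text):
    """Split text holding several concatenated GEDCOMs into separate strings.

    Hypothetical helper: anything outside a "0 HEAD" ... "0 TRLR" pair
    (for example a note the sender typed between the GEDCOMs) is skipped.
    """
    gedcoms = []
    current = None  # lines of the GEDCOM being collected, or None between GEDCOMs
    for line in text.splitlines():
        # Ignore surrounding spaces and a possible byte-order mark.
        stripped = line.strip().lstrip("\ufeff")
        if stripped == "0 HEAD":
            current = [stripped]          # a new GEDCOM starts here
        elif current is not None:
            current.append(line.rstrip())
            if stripped == "0 TRLR":      # this GEDCOM is complete
                gedcoms.append("\n".join(current) + "\n")
                current = None
    return gedcoms


# Made-up sample: two tiny GEDCOMs with a stray note between them.
sample = "\n".join([
    "0 HEAD",
    "1 FILE first.ged",
    "0 TRLR",
    "here is the backup",
    "0 HEAD",
    "1 FILE second.ged",
    "0 TRLR",
])

parts = split_gedcoms(sample)
print(len(parts), "GEDCOM(s) found")
```

Each string in parts could then be saved as its own plain-text file and imported into its own cloned Family File, as suggested above.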

and how do I find the "Crow-Foster" file in all of this?

You can search the contents of the GEDCOM for instances of those names, or anything else, when it's opened in the word processor, but finding what you want--and displayed in a more useful format--will be far more easily done once you have the information imported into a Family File and can let all of Reunion's Family Card, Index, Find, etc., features do it.
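
The text search mentioned above can also be done with a short script. The following is a rough Python sketch, not part of Reunion: the function name find_names and the sample people are invented for illustration. It relies on the GEDCOM convention that a person's name sits on a "1 NAME" line with the surname enclosed in slashes.

```python
import re

# A level-1 NAME line, e.g. "1 NAME Mary /Crow/".
NAME_LINE = re.compile(r"^\s*1\s+NAME\s+(.*)$")


def find_names(gedcom_text, surnames):
    """Return (line number, name) pairs whose surname is in `surnames`.

    Hypothetical helper: matching is case-insensitive and looks only at
    the text between the slashes on "1 NAME" lines.
    """
    wanted = {s.lower() for s in surnames}
    hits = []
    for number, line in enumerate(gedcom_text.splitlines(), start=1):
        match = NAME_LINE.match(line)
        if not match:
            continue
        name = match.group(1).strip()
        surname = re.search(r"/([^/]*)/", name)
        if surname and surname.group(1).strip().lower() in wanted:
            hits.append((number, name))
    return hits


# Made-up sample data for illustration only.
sample_gedcom = "\n".join([
    "0 HEAD",
    "0 @I1@ INDI",
    "1 NAME Mary /Crow/",
    "0 @I2@ INDI",
    "1 NAME John /Foster/",
    "0 @I3@ INDI",
    "1 NAME Ann /Smith/",
    "0 TRLR",
])

matches = find_names(sample_gedcom, ["Crow", "Foster"])
for number, name in matches:
    print(number, name)
```

The line numbers only tell you where to look in the text file; for actually browsing the families, the imported Family File remains the easier route.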

Do I import it into a family file and then delete or what?

Yes, import the individual GEDCOM(s) into--I'm suggesting--a separate, standalone, clone(s) of your Family File and then you can restrict what would ultimately be imported into your primary Family File by Marking the records that you want in that now-populated clone and ensuring you enable the "Marked People" option when you do the Import. I also find it valuable to use the Automatic Flag option during the Import to assign a unique Flag to the imported records. That way, it's easy to differentiate what had been in the primary Family File (no such Flag) and what you've just imported (has that Flag). (Once you're finished with the combination of information in your primary Family File, you can always delete that Flag if you no longer need that aid to be able to easily find the imported records.)

I have a second GEDCOM of another family that may have some duplicates of info I have. I have a "second working file" ready to receive this. Do I go back and forth between the family file and the working file to compare or is there a way to print out both GEDCOM and family file and do a visual comparison or ?

Just import that GEDCOM into another, standalone, clone of whichever Family File against which you wish to make the comparison. Then, as Reunion lets you have multiple Family Files open at the same time, you can more easily make the comparison between the two Family Files. After you understand what's new in that separate Family File, you can decide whether a selective import (as described above through the use of Marking) into your primary Family File--followed, perhaps, by using the Find->Match & Merge People operation--is the most efficient way to integrate the new information as opposed to, for example, just doing some copying-and-pasting into your primary Family File.

Bill McQuary
25 August 2006, 12:02 PM
Do I import it into a family file and then delete or what?

My advice is to open any GEDCOM file you receive as a separate Reunion file. Then you can mark all the records you want to import into your master Reunion file without the risk of duplicating individuals.

Save the newly created family file after assigning it a name that identifies its creator, e.g., "J. B. Brown's Smith family". Cite it as a source on the appropriate records in your master Reunion file.

As a general rule I would not import someone else's GEDCOM data directly into my Reunion family file.

Regards,
Bill

STEVE
25 August 2006, 06:50 PM
I have a slightly different way of dealing with imported GEDCOMs. Or, more like a different way of perceiving them. When I receive a GEDCOM, I import it as a separate Reunion database. It is named, and then added to my source list just as if it were a book I'd bought. A master copy is compressed and archived. A working copy remains in my genealogy folder. I then use the GEDCOM exactly as I would any other digitized resource. This is more a matter of perception than of actual usage. But I think it a crucial difference.

To me, merging two databases is, at best, problematic. At worst, a total disaster. Keeping the two databases separate and using the "import" as a source means that I can work on merging the data for one person who appears in both databases or add whole branches that appear in one, but not the other. It also means that I am never in the position of not being able to switch tasks fairly quickly just because I have a merge in progress.

The risk of confusion and/or muddling of the data is much decreased (at least for me and my weak short-term memory) and some very clear breakpoints become evident. As well, if I find a major discrepancy in the data, all I need do is mark and annotate both files for that person and wait for some form of confirmation or distinction before attempting to merge. This way I avoid any more confusion and double-entry beyond "This person also appears in 'database x' with different data." At another, perhaps more convenient, time I can work on resolving the differences.

There is no difference in how you resolve issues here. All the guidelines work just the same. The difference here is one of how the databases themselves are perceived. I hope this helps.

STEVE