
View Full Version : GEDCOM development


John Yates
30 April 2008, 04:11 PM
The GEDCOM specification is not general enough to cover many features
today. It seems that there is no development on the specification. Shouldn't
there be a committee to periodically update the specification, and perhaps
elevate it to be a true standard that evolves with the field? And have members
in addition to The Church of Jesus Christ of Latter-day Saints?

I think genealogists would like that to happen, but perhaps not the software
vendors, who'd like to lock you into their product forever.

Thoughts?

Mary Arthur
30 April 2008, 07:49 PM
You are several years too late. GEDCOM development stalled for many reasons, including lack of interest/support from the LDS church and the understanding that the task was impossible.

John Yates
30 April 2008, 09:33 PM
"When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong." --Arthur C. Clarke

Having been a scientist, I agree with Clarke. If something appears impossible, the right
people just haven't worked on it yet. As baby boomers retire and get
interested in genealogy, I think this is going to get resolved at some point.

I'd bet that someone at Google could figure it out. The Internet itself would
look impossible if it hadn't happened, and it happened largely through RFC
specifications. And Tim Berners-Lee, who invented the World Wide Web, is a physicist.

A year or more ago I lobbed in to Google the suggestion that aiding
the development of genealogy is needed and fits their stated
agenda of cataloging the world's information (they have an idea-intake
site). Maybe someone there interested in genealogy is already investing
his one free day a week on such an advancement. A little language where
general objects and their definitions can be specified in the GEDCOM code itself
(so it can be extended on the fly) could do it.
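For what it's worth, here is a minimal sketch of what such on-the-fly extension might look like. The `_DEFN` record and its `TAG`/`DESC` sub-structure are invented here purely for illustration (real GEDCOM 5.5 only reserves the leading underscore for vendor-specific tags); the point is just that a few lines of code can read both the standard lines and the definitions:

```python
# Sketch: parsing GEDCOM-style lines plus a hypothetical _DEFN record
# that declares new tags on the fly. _DEFN is invented for this example;
# GEDCOM 5.5 only reserves leading-underscore tags for extensions.

SAMPLE = """\
0 @S1@ _DEFN
1 TAG _DNAKIT
1 DESC DNA test kit number
0 @I1@ INDI
1 NAME John /Yates/
1 _DNAKIT FTDNA-12345
"""

def parse(text):
    """Split GEDCOM lines into (level, xref, tag, value) tuples."""
    records = []
    for line in text.splitlines():
        parts = line.split(" ", 2)
        level = int(parts[0])
        if len(parts) > 1 and parts[1].startswith("@"):
            # cross-reference record: "0 @X1@ TAG [value]"
            xref, rest = parts[1], parts[2].split(" ", 1)
            tag = rest[0]
            value = rest[1] if len(rest) > 1 else ""
        else:
            xref = ""
            tag = parts[1]
            value = parts[2] if len(parts) > 2 else ""
        records.append((level, xref, tag, value))
    return records

def known_tags(records):
    """Collect custom tags declared via the hypothetical _DEFN records."""
    schema = {}
    current = None
    for level, xref, tag, value in records:
        if level == 0:
            current = tag
        elif current == "_DEFN" and tag == "TAG":
            schema[value] = ""
        elif current == "_DEFN" and tag == "DESC" and schema:
            # attach the description to the most recently declared tag
            last = list(schema)[-1]
            schema[last] = value
    return schema

recs = parse(SAMPLE)
print(known_tags(recs))   # {'_DNAKIT': 'DNA test kit number'}
```

Running it prints the tags the file itself declared, description and all; a real proposal would of course also need value types, cardinality rules, and the rest.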

[See Wikipedia for "Request for Comments" if you are unfamiliar with RFCs
and interested].

GEDCOM needs an RFC.

jep111
01 May 2008, 04:41 AM
Where to start?

What organization do you propose to assume custody of the spec? GEDCOM exists because of the LDS church; left to genealogists (who are they, anyway?), I feel it's safe to say no spec would ever have happened. Although I have no hard data to support this assertion, I would estimate that over half of the people who support themselves solely by work in the various fields of genealogy are working on behalf of the LDS church.

There is an active discussion about what constitutes a "professional" genealogist, with no clear definition in sight. If the people doing it can't even define their roles, it is not reasonable to expect them to organize enough coherent muscle in the marketplace to elevate a spec to a "true standard."

So that leaves developers wanting to please their customers, and also to develop features that will prove useful and thereby ensure their success in the marketplace. Providing features unique to a specific program helps drive customers to that product. The very thing that spurs sales and keeps the developers in business (which I would argue is both great and necessary) makes their data sets less likely to be completely compatible with each other. Making GEDCOM a true standard would likely reduce competition by requiring compliance with one and only one view of the genealogical data set. Not something I hope for.

Lastly, I am not comfortable with your remark that software vendors want to trap you into their product forever. Not only because you are posting it on a forum graciously hosted by one of those vendors, but because you are implying that there is an alternative available that they are not willing to embrace. That is simply not so. What work you are able to do is always influenced by the tools you choose. I try not to hammer with a saw, and I try not to enter data into Reunion if my client requests a TMG data set. Since I do work for hire, I own most of the packages available (at least the ones clients have asked for). I do the work they require in the format they request, as that seems to me to provide the best experience for my customer. I do my personal research data in Reunion. I cannot conceive of wanting to move my data sets to another product, so in that sense I am "trapped"; yet I fail to see why Leister or any other developer needs to make sure that I can. Only the LDS had the marketplace muscle to make some form of GEDCOM de rigueur, and they achieved what they were hoping for.

I suggest that if you feel this is a cause worth carrying, you address an organization that you think could perhaps become the steward of the spec. Perhaps post to the APG list?

John

John Yates
01 May 2008, 01:57 PM
I'm new to serious genealogy, so I don't yet know who is advancing the field, nor how. But I have a scientific and information technology background, and would like to see genealogy done scientifically. Ray Kurzweil projects that in a decade (2019), a $1000 computer will be as smart as a human. Couple this with Google's mission of cataloging the world's knowledge (and others putting material online as well), and we are closing in on the day when computers will be able to do genealogy from online sources by themselves. The biggest challenges to this will not be technology, but privacy and legal issues.

The technology should be built on a core of open standards built by the best and the brightest, and I suggest it is not too early to begin development of such technical standards. Google could steward such a process. All I was looking for was whether anyone has been thinking ahead on this. I will continue asking around, and maybe see if I can open a dialog with Google about it if no one else is thinking about it and working on it.

The World Wide Web became the ubiquitous tool it is today because its inventor, Tim Berners-Lee, *refused* to let it become proprietary. If he hadn't insisted on that, we'd probably have multiple proprietary Internets today, and none of them would be as useful as the single one we have.

I didn't mean to slight Reunion. I did a fair amount of homework before moving to it (I had FTM on Windows, and had fully invested in TMG before deciding that I wanted to run only Mac *native* applications, so I abandoned it). I found Reunion to be one of the best. Open standards will not harm, say, the top five of "the best". But competition is what drives innovation and value for customers.

Actually, I really want to get back to building my genealogy database. The last year has been spent trying out applications that I eventually abandoned, and learning from them, clubs, books, workshops, etc. I am now starting over with Reunion, armed with the sourcing knowledge I've acquired over the past year. (5 people in Reunion so far while I build their sources. About 1500 in FTM, with several thousand more to add from genealogies published and found in Historical Centers.) My hope is that standards at the core of what I do will make my work valuable to generations to follow, and to the computers that will do genealogy in the future.

jep111
02 May 2008, 05:01 AM
I'm new to serious genealogy, so I don't yet know who is advancing the field, nor how. But I have a scientific and information technology background, and would like to see genealogy done scientifically. Ray Kurzweil projects that in a decade (2019), a $1000 computer will be as smart as a human. Couple this with Google's mission of cataloging the world's knowledge (and others putting material online as well), and we are closing in on the day when computers will be able to do genealogy from online sources by themselves. The biggest challenges to this will not be technology, but privacy and legal issues.

For the record, my bachelor's degree is in Information Systems, so it is with a wink and a broad grin that I welcome the news that I only have to wait for 11 more years for computers to be bad genealogists in their own right.

If you honestly think that computers will someday soon even have access to all the available information that exists, you must have either vastly underestimated how many records humans have kept, or not understood how long it takes to put them into machine-readable form. Even relatively minor data sets such as the FreeBMD project have taken untold man-hours, and they're only creating an index! Not to mention how the various entities that have ownership of records deny public access to those records. Lastly, records are transcribed by humans, making mistakes as we all do. GIGO.

The technology should be built on a core of open standards built by the best and the brightest, and I suggest it is not too early to begin development of such technical standards. Google could steward such a process. All I was looking for was whether anyone has been thinking ahead on this. I will continue asking around, and maybe see if I can open a dialog with Google about it if no one else is thinking about it and working on it.

Perhaps you should take a look at WeRelate.org. It is a genealogy wiki; i.e., a community of contributors who donate their time and energy to enter data and edit others' submissions into a hopefully cohesive and researched work. Maybe this is where the future lies? (Not for me: I personally enjoy the work. The last thing I want is for someone or something to do it for me.)

I didn't mean to slight Reunion. I did a fair amount of homework before moving to it (I had FTM on Windows, and had fully invested in TMG before deciding that I wanted to run only Mac *native* applications, so I abandoned it). I found Reunion to be one of the best. Open standards will not harm, say, the top five of "the best". But competition is what drives innovation and value for customers.

I still find it incongruous at best to suggest that the applications should play nice when even the operating systems don't. Why don't you demand instead that TMG become Mac native?

The companies that create genealogy software are all very small (except for FTM, easily the worst in the market, yet the largest; never underestimate the value of distribution channels). I've spoken to most of the developers, and to a person they are passionate about genealogy; from what I've observed, all of them together probably don't take home what any NBA player does in a year. They develop because they have a vision. You can't have competition driving innovation and value while at the same time requiring all programs to respond the same way to the same data sets. That leaves the developer with screen layout as the major way to differentiate his product. (I'm buying Legacy; I love the way they use Arial!)

In one of my other aspects I'm a member of the Audio Engineering Society, and I served on the Standing Committee for Forensics, so I fully understand the value of standards. But I also understand business. The audio industry generates a large amount of money, and people and companies are willing to co-operate in areas that will create a bigger pie. The AES is effective only because its member corporations allow it to be effective. (Think of the United Nations: if the US pulls its funding, does it still exist?) Even though the AES is the acknowledged keeper of the technical standards, there are plenty of VHS-vs-Beta examples to be had. Even web browsers can't agree on how to interpret HTML, and no one seems able to make Microsoft conform to any standards. The point here (finally!) is that there needs to be a financial "critical mass" to support standards organizations, as they are basically funded much like a little tax for the good of the industry. The only group in genealogy with such financial wealth was the LDS church, and they developed GEDCOM until they either were satisfied or it no longer survived the cost/benefit analysis.

Actually, I really want to get back to building my genealogy database. The last year has been spent trying out applications that I eventually abandoned, and learning from them, clubs, books, workshops, etc. I am now starting over with Reunion, armed with the sourcing knowledge I've acquired over the past year. (5 people in Reunion so far while I build their sources. About 1500 in FTM, with several thousand more to add from genealogies published and found in Historical Centers.) My hope is that standards at the core of what I do will make my work valuable to generations to follow, and to the computers that will do genealogy in the future.

I wish you wonderful fortune in your research. One last caution; genealogies published and found in FHCs are notoriously unreliable. Again, GIGO.

Cheers!

John

John Yates
02 May 2008, 01:42 PM
I agree it will take more than 11 years. But see Ray Kurzweil's books about the Law of Accelerating Returns. It is an exponential curve, and after thousands of years technology has now reached the knee of the curve. Moore's Law of doubling every 18 months is just a special case of it. It doesn't matter much when you double 1 MHz to 2 MHz, but when you start doubling 2 GHz to 4, 8, 16, 32, and so on, change comes rapidly! If it took 10 years to do the first full human genome, the second 5 years, the third 2.5 years, then 1.25, 0.625, and the cost is also coming down by factors of 2, someday soon your full human genome will be inexpensive and fast to sequence. I'm watching genealogical DNA testing and waiting, as the number of markers that can be tested keeps going up, and perhaps soon people will just be able to do a full DNA sequence, cheaply!

Then come the legal and ethical issues! But once technology is out of the box, it can't be put back in.
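To make that halving arithmetic concrete: assuming each successive full-genome run takes half the time and half the cost of the one before (the 10/5/2.5-year figures are the ones above; the $1,000,000 starting cost is a made-up number just to show the shape of the curve), a few lines show how fast the totals collapse:

```python
# Halving model: each successive full-genome run takes half the time
# and half the cost of the previous one. The 10-year start comes from
# the post; the $1,000,000 starting cost is purely illustrative.
years, cost = 10.0, 1_000_000.0
for run in range(1, 9):
    print(f"run {run}: {years:8.4f} years, ${cost:>12,.2f}")
    years /= 2.0
    cost /= 2.0
# By run 8 a genome takes under a month and costs under $8,000.
```

Whether real sequencing follows this curve is exactly the point under debate, but the compounding itself is not in question.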

Scanning by Google is not done by humans; it is done by machines that could be called robots. Kurzweil also made some significant advances in OCR (Optical Character Recognition). He bases his predictions on where a technology is on its exponential curve, rather than extrapolating linearly as most people do. His books are a good read.

So, some time after those 11 years, with further robotic development, and then further development by machines smarter than man, machines that can devise and apply better critical data sanity checks than a human can, you begin to see that the future holds great promise. We can't make the mistake of using today's technological understanding to bound what is possible in the future, or how soon it will arrive.

So I do think computers will have access to all written records sooner than we think. And Google and others are hard at work on it. Just as a human today can rank confidence in sources, so will computers, and better than the human brain can.

I do agree that today it is fun to hunt down genealogical resources. But as new tools become available, we abandon others. How often do we do a Google search today instead of going to a library to look something up? We usually try it first, and go to the library only when necessary.

You mention GIGO. If humans can recognize garbage in, I submit that a computer, when it reaches the intelligence of humans, can recognize it better.

I did talk to the TMG developers about a Mac version. They replied that it was not in the offing, as it is written in Visual FoxPro (I think; one of the MS Visual database offerings) and they had no intention of a rewrite.

Operating systems today do not play nice simply because the technology has not advanced sufficiently yet.

I understand that business reasons often get in the way of a good product. Almost a decade ago, it was clear that MS did not care a whit about security because they were making money hand over fist anyway. They would only pay attention to security when it affected their bottom line, which is what eventually happened. Even now they can't do a good job of it.

My scientific training de-emphasizes monetary concerns in favor of truth and beauty (elegant OSs and applications).

And I know that even publications in historical centers need to be vetted; they are only clues. Some document their primary references, some don't. Some have reputations that inspire confidence, some don't. It is just like how we develop ways of knowing whether to believe Google search results, or the person on the corner saying the world will end in seven days.
And, I submit, in a decade or two, computers will have this skill too.

Anyway, if you aren't familiar with Kurzweil's work, I urge you to have a look. I even think he is on the fringe with one of his books, about a lifestyle intended to let you live forever (I'll let his work explain what he means: the body of course wears out, but replacements can potentially preserve an individual's consciousness, so in a general sense you could be said to never die...)

Anyway, now, back to genealogy! :-)

jep111
02 May 2008, 06:40 PM
I agree it will take more than 11 years. But see Ray Kurzweil's books about the Law of Accelerating Returns. It is an exponential curve, and after thousands of years technology has now reached the knee of the curve. Moore's Law of doubling every 18 months is just a special case of it. It doesn't matter much when you double 1 MHz to 2 MHz, but when you start doubling 2 GHz to 4, 8, 16, 32, and so on, change comes rapidly! If it took 10 years to do the first full human genome, the second 5 years, the third 2.5 years, then 1.25, 0.625, and the cost is also coming down by factors of 2, someday soon your full human genome will be inexpensive and fast to sequence. I'm watching genealogical DNA testing and waiting, as the number of markers that can be tested keeps going up, and perhaps soon people will just be able to do a full DNA sequence, cheaply!

Then come the legal and ethical issues! But once technology is out of the box, it can't be put back in.

Scanning by Google is not done by humans; it is done by machines that could be called robots. Kurzweil also made some significant advances in OCR (Optical Character Recognition). He bases his predictions on where a technology is on its exponential curve, rather than extrapolating linearly as most people do. His books are a good read.

So, some time after those 11 years, with further robotic development, and then further development by machines smarter than man, machines that can devise and apply better critical data sanity checks than a human can, you begin to see that the future holds great promise. We can't make the mistake of using today's technological understanding to bound what is possible in the future, or how soon it will arrive.


Let me assure you I am well acquainted with Raymond Kurzweil and his work. I met him in the early '80s and still own several of his synthesizers. A brilliant man and a nut (I mean that in the nicest way possible).

But Moore's Law specifies that the number of transistors that can be put into an IC at minimal cost will double every two years, not that processor speed will double every 18 months, and indeed, it no longer does. Photolithography is reaching its limits, and "new" technologies such as photon-based computing (thirty-plus years old) are not taking its place. Bubble memory was touted as the next great thing 15 years ago and is still nowhere to be seen in a commercial sense. OCR is better than 90% accurate on modern printed material properly lit, but non-existent for that 1713 Bishop's transcript in Latin. While you're at it, read about Wirth's Law and realize the sad truth that software is getting slower at a faster rate than hardware is speeding up.

Like it or not, there are limits to everything. Much of our current world is amazing and I could not have envisioned the like in my youth. However, there are also lots of areas where very little real differences exist between that world and this. Cell phones no longer get smaller every year, in fact they are now growing slightly. Why? Because people have to be able to manipulate them comfortably, and our fingers aren't shrinking.

Still, I recognize the light of futurist faith in your ASCII, and realize that this has gone far beyond any real discussion of the future of GEDCOM. I hope, at least for any observers, that I've managed to explain the state of affairs. For you, my colleague John, I merely say "Dream On!"

Cheers!

John

John Yates
03 May 2008, 01:41 AM