My Genealogy Wish - A Handwriting Search Algorithm
We are all familiar with the different ways to search Ancestry and other genealogy databases for our ancestors. We type in a surname and we have two options. First, we may choose the "exact" search, hoping that the stars are aligned and that the census enumerator wrote our ancestor's name correctly and the indexer then read the handwriting correctly. When that doesn't work, we may try the "soundex" search. Did the census enumerator write the surname as it sounded and not how we think it should be spelled?
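For anyone curious what a soundex code looks like, here is a minimal Python sketch of the standard American Soundex rules. It is only an illustration, not the code any particular genealogy site runs, and the function name `soundex` is my own choice.

```python
# Minimal sketch of American Soundex: keep the first letter, then encode
# the following consonants as digits, padding or trimming to four characters.
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate repeated codes.
            continue
        code = CODES.get(ch, "")  # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (result + "000")[:4]


print(soundex("Eiswerth"), soundex("Eisworth"))  # both spellings share one code
print(soundex("Bascom"), soundex("Baskum"))      # so do these
```

Names that sound alike end up with the same code, which is why a soundex search can find a surname the enumerator spelled by ear.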
But what options do we have when the census enumerator spelled the name correctly but the indexer got it wrong? Often we try searching the surname with just the first few letters and an asterisk (I could search for Eiswerth as "Eis*"). Or we add a question mark to stand in for an unknown letter (search Bascom as "Basc?m"). But what if there was an easier way?
Here is where my wish comes into play. I wish there was a way to search based on handwriting. Just as soundex looks at letters that sound alike, I want a search algorithm that looks at how letters look alike.
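To show what those wildcard searches are doing, here is a small Python sketch using the standard library's `fnmatch` module, where `*` matches any run of letters and `?` matches exactly one. The `wildcard_search` helper and the sample list of surnames are made up for illustration; real genealogy sites have their own wildcard rules, which may differ.

```python
from fnmatch import fnmatchcase


def wildcard_search(pattern, surnames):
    """Return the surnames matching a pattern, ignoring upper/lower case."""
    pattern = pattern.lower()
    return [s for s in surnames if fnmatchcase(s.lower(), pattern)]


# Hypothetical index entries, including some misread spellings.
index = ["Eiswerth", "Eisworth", "Eisenhower", "Bascom", "Bascum", "Baskom"]

print(wildcard_search("Eis*", index))    # ['Eiswerth', 'Eisworth', 'Eisenhower']
print(wildcard_search("Basc?m", index))  # ['Bascom', 'Bascum']
```

Notice that "Basc?m" still misses "Baskom": a wildcard only helps if we guess which letter was misread.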
Here are some of my ideas to get it started:
- What letters look the same? Capital Ls and Ss. They would have the same code, just like the soundex.
- How many humps does the letter have? All those Ns, Ms, Us, Rs, and other letters could be counted as humps. Then when you searched you would find sequences of letters with the same number of humps.
- Which direction do the stick parts of the letters go? Up like a "d" or down like a "p".
- What ways did letter groups get written? Double "s" looked like an "f".
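To make the ideas in the list above a little more concrete, here is a rough Python sketch of what such a code might look like. Everything in it is an assumption for illustration: the hump counts, the lists of "up" and "down" letters, and the symbols used in the code are my guesses, not a tested system.

```python
# Rough sketch of a handwriting-based code. All tables below are guesses.
HUMPS = {"i": 1, "r": 1, "n": 2, "u": 2, "v": 2, "m": 3, "w": 3}
ASCENDERS = set("bdhklt")    # stick goes up, like "d"
DESCENDERS = set("gjpqy")    # stick goes down, like "p"
CAPITAL_GROUPS = {"S": "L"}  # capital S and L share a code


def handex(name):
    letters = [c for c in name if c.isalpha()]
    if not letters:
        return ""
    first = letters[0].upper()
    code = [CAPITAL_GROUPS.get(first, first)]
    # A double "s" was often written so that it looks like an "f".
    rest = "".join(letters[1:]).lower().replace("ss", "f")
    humps = 0
    for ch in rest:
        if ch in HUMPS:
            humps += HUMPS[ch]  # add up humps across neighboring letters
            continue
        if humps:
            code.append(str(humps))
            humps = 0
        if ch in ASCENDERS:
            code.append("^")
        elif ch in DESCENDERS:
            code.append("_")
        elif ch == "f":
            code.append("f")
        else:
            code.append("o")  # round or other letters: a, c, e, o, s, x, z
    if humps:
        code.append(str(humps))
    return "".join(code)


print(handex("Bascom"), handex("Bascoin"))  # Boooo3 Boooo3
print(handex("Lane") == handex("Sane"))     # True
```

In this sketch "Bascom" and the misreading "Bascoin" get the same code, because an "m" and an "in" both add up to three humps. The code is deliberately loose, so a real version would need tuning to avoid returning too many unrelated names.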
What do you think about "handex"? (And I will call it that until someone thinks of something better!) Leave a comment and let me know.
What are your wishes for the genealogy community? I'm still looking for guest bloggers to add to the Summer of Genealogy Wishes and I would love to add your wish to the series. Send me an email at genwishlist@gmail.com.
I love the idea of handex! Since enumerators often knew the people they polled, they at least knew how the name was pronounced, whereas the indexers often seem not to be familiar with the area (some of the best transcriptions are by GenWeb volunteers - sometimes the GenWeb sites for a locality have their own transcriptions). The GenWeb site for Greenville, SC tracks families through the censuses and often even gives land information!
P.S. I will try to get a Genealogy Wish post to you later this month; I'll e-mail you when I know for certain that I can do it.
I'm all for letting you name it, Tina. I think "handex" is a great name!