Data-driven versus Pedigree-driven

The Millington One-Name Study (ONS) is classified as large by the Guild of One-Name Studies (there are 4,735 entries in the 1881 census covering England, Wales and Scotland). By 1911 there are 6,925 entries and to date there are 7,434 entries in the 1939 register. Add to this some 63,000 entries from civil registration and I am faced with a significant logistical problem – how best to start organising these into pedigrees.

The traditional, pedigree-driven approach to genealogy is to start with what you know and work backwards, to a marriage and then a baptism or a birth – and on to the next generation (not forgetting deaths and burials). I believe that this approach works well with small and medium-sized ONSs, where the set of data is relatively small.

However, I have found that for my study it does not work well for two reasons:

  • a tendency to get distracted on a particular line;
  • too many similar names, leading to a brick wall sooner or later. There is then a tendency to try to force a match a little too hard, leading to poor-quality pedigrees.

My alternative approach, which I have termed data-driven, is to work through sets of data one record at a time, building as many connections from that record as you can. In that way small pedigrees start evolving, slowly at first, but hopefully more quickly as more connections are made.

Some data sets serve this approach better than others. Census records (and I include the 1939 register) are an excellent start because in many cases they list a collection of individuals who are already organised as a family.

So I might start by going through each census record, trying to match any of the individuals listed to a birth registration. If successful, this will (hopefully) give me a mother's maiden name, which might help identify a marriage for the parents. And repeat … 🙂 for all of the census records. Slowly, slowly, the number of people linked grows.
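
As a rough illustration only, one pass of that loop could be sketched in Python along the following lines. The record types, field names and matching tolerances are all assumptions made for this example, not the tooling actually used in the study, and the sample rows are invented. A match is only accepted when exactly one birth registration fits, which is one way of avoiding a forced match when there are too many similar names.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical record shapes; real transcriptions carry many more fields.
@dataclass(frozen=True)
class CensusPerson:
    forename: str
    surname: str
    age: int
    census_year: int


@dataclass(frozen=True)
class BirthRegistration:
    forename: str
    surname: str
    year: int
    mother_maiden_name: str


@dataclass(frozen=True)
class Marriage:
    groom_surname: str
    bride_surname: str
    year: int


def candidate_births(person: CensusPerson,
                     births: List[BirthRegistration],
                     tolerance: int = 2) -> List[BirthRegistration]:
    """Births with the same name, within `tolerance` years of the
    birth year implied by the census age (an assumed tolerance)."""
    estimated_year = person.census_year - person.age
    return [
        b for b in births
        if b.surname.lower() == person.surname.lower()
        and b.forename.lower() == person.forename.lower()
        and abs(b.year - estimated_year) <= tolerance
    ]


def unique_birth(person: CensusPerson,
                 births: List[BirthRegistration],
                 tolerance: int = 2) -> Optional[BirthRegistration]:
    """Return a birth only when exactly one candidate fits;
    ambiguous or missing matches are left unlinked (None)."""
    candidates = candidate_births(person, births, tolerance)
    return candidates[0] if len(candidates) == 1 else None


def candidate_marriages(birth: BirthRegistration,
                        marriages: List[Marriage],
                        max_years_before: int = 25) -> List[Marriage]:
    """Marriages of the child's surname to the mother's maiden name,
    in or up to `max_years_before` years before the birth (assumed window)."""
    return [
        m for m in marriages
        if m.groom_surname.lower() == birth.surname.lower()
        and m.bride_surname.lower() == birth.mother_maiden_name.lower()
        and 0 <= birth.year - m.year <= max_years_before
    ]


# Invented sample rows, purely to show the flow.
person = CensusPerson("John", "Millington", age=10, census_year=1881)
births = [
    BirthRegistration("John", "Millington", 1871, "Hall"),
    BirthRegistration("John", "Millington", 1850, "Smith"),
]
marriages = [
    Marriage("Millington", "Hall", 1868),
    Marriage("Millington", "Smith", 1848),
]

birth = unique_birth(person, births)
parents = candidate_marriages(birth, marriages) if birth else []
```

Running the same functions over every person in every census record, and recording each accepted link, is the "and repeat" step. Anything that comes back as None simply stays unlinked until another data set narrows the field.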

I suspect a different approach might be required when I start to get seriously into the pre-1837 data, but for now it has worked out well for the censuses from 1841 through to 1911 and is going well for the 1939 register.

I would be interested to know how other operators of large ONSs carry out the family reconstruction aspects of their studies.