From Research to Data with AI 2 of 5: Cleaning Up—Fixing, Formatting, and Validating Data

Andrew Redfern, Fiona Brooker
Mar 26, 2026
515 views

About this webinar

Use AI to extract, clean, organise, and analyse your family history research. Intermediate level, focused on workflows and data handling; ideal for users managing large research projects; activities include table-building, clustering, and data cleaning.

About the speakers

Andrew Redfern is an enthusiastic family historian and accomplished speaker, having delivered presentations both in his home country of Australia and internationally over many years. His innovative wo...
Fiona Brooker is a professional genealogist (Memories In Time) who has been actively researching her family history for over 35 years, inspired by two marriage certificates and a collection of family ...

Key points and insights

In the second session of the "From Research to Data with AI" series, Fiona Brooker and Andrew Redfern provide a deep dive into the essential—though often overlooked—phase of cleaning, formatting, and validating genealogical data. While many researchers are quick to jump straight to AI-generated results, this webinar emphasizes that the quality of any family history output is only as good as the data that feeds it. By treating data extraction as a structured four-stage process—extract, clean, format, and validate—genealogists can move beyond simple transcriptions and toward a sophisticated digital archive. The session illustrates how to bridge the gap between traditional research methods and cutting-edge technology, ensuring that family trees remain accurate, consistent, and ready for future generations.


  • Advanced Prompting for Edge Cases: Success with AI transcriptions often depends on instructing the tool how to handle "edge cases," such as physical document damage, illegible handwriting, or ambiguous punctuation, ensuring the AI remains a faithful transcriber rather than a creative editor.
  • The Power of GEDCOM Standardization: Adhering to universal genealogical standards—such as using three-letter capitalized month abbreviations and specific date prefixes like "ABT" or "BEF"—is critical for maintaining data integrity when moving information between different AI tools and family tree software.
  • Cross-Platform Data Auditing: AI excels at "gap analysis," allowing researchers to compare reports from different platforms (like Ancestry and MyHeritage) to instantly identify missing birth dates, burial locations, or inconsistent occupation records that might otherwise be missed during manual reviews.
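
The gap analysis described in the last point can also be cross-checked without an AI tool. The sketch below is a plain deterministic comparison, not the prompt-based method shown in the webinar. It assumes each platform's report has already been extracted into a dictionary keyed by person; the function name `gap_report`, the field names, and the sample records are all hypothetical.

```python
# Hypothetical field names; adjust to whatever columns your extracted tables use.
FIELDS = ("birth_date", "burial_place", "occupation")


def gap_report(source_a, source_b, fields=FIELDS):
    """Compare two {person: {field: value}} dicts and list gaps and conflicts."""
    findings = []
    for person in sorted(set(source_a) | set(source_b)):
        rec_a = source_a.get(person, {})
        rec_b = source_b.get(person, {})
        for field in fields:
            # Treat None, empty strings, and whitespace-only values as missing.
            val_a = (rec_a.get(field) or "").strip()
            val_b = (rec_b.get(field) or "").strip()
            if not val_a and not val_b:
                findings.append((person, field, "missing in both"))
            elif not val_a:
                findings.append((person, field, "missing in A"))
            elif not val_b:
                findings.append((person, field, "missing in B"))
            elif val_a.casefold() != val_b.casefold():
                findings.append((person, field, f"conflict: {val_a!r} vs {val_b!r}"))
    return findings


# Invented sample data standing in for two platform exports.
tree_a = {"Mary Example": {"birth_date": "3 MAR 1881", "burial_place": "",
                           "occupation": "Weaver"}}
tree_b = {"Mary Example": {"birth_date": "3 MAR 1881", "burial_place": "Sample Cemetery",
                           "occupation": "Dressmaker"}}

for person, field, note in gap_report(tree_a, tree_b):
    print(f"{person}: {field}: {note}")
```

A list like this is useful as a second opinion on an AI-generated audit: anything the AI flags that the comparison does not (or the reverse) is worth checking by hand.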


To truly modernize a family history workflow, watching the full session is highly recommended. It offers practical, step-by-step demonstrations of how to turn messy research notes into structured, GEDCOM-ready data that can be seamlessly integrated into a digital tree. Beyond the video, the accompanying syllabus contains a wealth of additional resources, including a data standardization checklist and specialized verb lists to help craft more effective prompts. Dive into these materials to refine your research process and ensure your family's story is preserved with the highest level of technical precision.
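
As an illustration of what "GEDCOM-ready" dates look like, here is a minimal sketch that converts a few common free-text date styles into the day, three-letter capitalized month, and year form, with the ABT, BEF, and AFT prefixes. The function name `to_gedcom_date` and the qualifier mapping are assumptions made for this example; they are not taken from the webinar or its syllabus, and the sketch does not cover ranges or other GEDCOM date forms.

```python
import re

MONTH_ABBR = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
              "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
MONTH_NAMES = ["january", "february", "march", "april", "may", "june",
               "july", "august", "september", "october", "november", "december"]

# Assumed mapping from free-text qualifiers to GEDCOM date prefixes.
QUALIFIERS = {
    "about": "ABT", "abt": "ABT", "circa": "ABT", "ca": "ABT", "c": "ABT",
    "before": "BEF", "bef": "BEF",
    "after": "AFT", "aft": "AFT",
}


def month_abbr(token):
    """Return the GEDCOM abbreviation if token is a month name or a prefix of one."""
    low = token.lower()
    if len(low) < 3 or not low.isalpha():
        return None
    for abbr, name in zip(MONTH_ABBR, MONTH_NAMES):
        if name.startswith(low):
            return abbr
    return None


def to_gedcom_date(text):
    """Return a GEDCOM-style date string, or None if the text is not understood."""
    tokens = re.sub(r"[.,]", " ", text).split()
    if not tokens:
        return None
    prefix = QUALIFIERS.get(tokens[0].lower())
    if prefix:
        tokens = tokens[1:]
    day = month = year = None
    for tok in tokens:
        if tok.isdigit() and len(tok) == 4 and year is None:
            year = tok
        elif tok.isdigit() and len(tok) <= 2 and day is None and 1 <= int(tok) <= 31:
            day = str(int(tok))
        elif month is None and month_abbr(tok):
            month = month_abbr(tok)
        else:
            return None  # Unknown token: flag for manual review rather than guess.
    if year is None or (day and not month):
        return None
    return " ".join(p for p in (prefix, day, month, year) if p)


for raw in ["about 3 March 1881", "c. 1850", "March 3, 1881", "sometime in spring"]:
    print(f"{raw!r} -> {to_gedcom_date(raw)!r}")
```

Returning `None` for anything unrecognized keeps the function a faithful formatter rather than a creative one: unclear dates are left for a person to review instead of being silently rewritten.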

Comments (32)

  1. TQ
    Tamara Quiring
    6 days ago

    Loving this series and learning a lot

  2. BG
    Brenda Glover
    6 days ago

    This week's session helped me to understand how to check my family tree for errors using AI. I have a habit of transposing numbers, so this edit function will serve as a great quality control feature. I also loved the prompt samples, which I have added to my prompt template toolbox/file folder. Thanks to you both and to other participants for their questions; I am learning from them as well.

  3. JM
    Joan McWilliams
    6 days ago

    Many good suggestions for writing AI prompts.

  4. LD
    Lawrence Doyle
    6 days ago

    Good hints and tips. Standardizing information was most helpful.

  5. JG
    John Goulait
    6 days ago

    Every time I listen to a webinar, I pick up knowledge for genealogical use. The world and the tools in it are ever improving.

  6. JM
    Jean Mayo
    6 days ago

    This series is absolutely blowing me away. So very useful in my research! I would love to see more of these 5 session classes as they can go into so much more depth. Bravo Andrew and Fiona! I appreciate all that you have done in putting these together. The demos are wonderful!

  7. JR
    John Ratliff
    6 days ago

    Another excellent presentation of an important emerging resource for all of us to use. Will work to become more informed in the actual workings of AI and look forward to the next webinar in 2 weeks.

  8. CH
    Carol Harper
    6 days ago

    The Redfern & Brooker AI sessions are so much more than a 5! They work together seamlessly, joyfully, and engagingly to teach us about how to get the most and the best from AI to meet our needs and, in fact, to limit those needs so we can deal with the abundance of information AI can provide. This is a truly amazing series, as was their first one!

From Research to Data with AI 2 of 5: Cleaning Up—Fixing, Formatting, and Validating Data - Legacy Family Tree Webinars