9/6/23 CELR Data Lake Migration Meeting notes

Date

Sep 6, 2023

Attendees:

APHL

  • Dari Shirazi:

  • Vanessa Holley: X

  • Gretl Glick: X

  • Geo Miller: X

  • Alissa McShane: X

  • John Reaves: X

CDC

  • Megan Light: --

  • Teresa Jue: --

  • Cheri Gatland-Lightener: --

  • Norris Kpamegan:

  • Ryan Harrison: X

  • Marion Anandappa:

  • Cosmin P.: X

  • Rohit Panwar: X

  • Leslyn Mcnabb:

  • Emily Augistini: X

Peraton

  • Erroll Rosser: X

  • Kristin Peterson:

  • Tom Russell:

  • Marcelo Caldas: X

  • Don Lindsay:

  • Scott Rank

Goals

  • Update on Data Lake Migration Status

  • DEX Overview and Status Update

  • Go/No-Go Decision for load testing CELR>DEX Sep 12-14

Discussion topics

Item

Notes

Overall Status Updates

  • AIMS connection to DEX options: Research/Analysis: In Progress

    • AIMS/APHL Updates:

      1. Development tasks completed, resources identified: Ready for load testing Sep 12-14

        1. Updates/blockers:

          1. AIMS: Ready to go!

            1. Infrastructure installed, completed; testing completed with DEX

            2. Resources identified for load testing period

            3. Integration team: Mirth channel established; split channel for S3 bucket, copied over for load testing [Wouter and Charles will be online for monitoring]

        2. Rohit/DEX: Are the files submitted sequentially or parallel?

          1. AIMS: Files will be sent in parallel; as they arrive at AIMS, they will be sent on to DEX

          2. A control is available to limit concurrency to ~10 at a time [DEX: limit not needed]

          3. Cosmin: Would make sense to limit to 100 at a time (would not receive a million at a time)

            1. AIMS: Is functionally able to limit load

        3. DEX: Consensus: Do not set a limit for sending; the intent of load testing is to determine whether DEX can handle a large volume of data

      2. DEX: Complete internal load testing:

        1. Update:

        2. AIMS: Send CELR Production data (n = 24 hours~) to DEX STG Environment: Target: ~August 21, 2023 (will confirm dates as we assess readiness)

          1. Load testing sending CELR Prod data to DEX STG: would include HL7 and CSV

          2. Criteria/Parameters for ingestion component testing:

            1. Volume: 10-20 TB per month; ~50 GB/day

            2. Period of Reporting: DEX: Preference is to receive 3 days of data [completed days, in sequence: low, average, high volume]

            3. Data Type: HL7 and CSV

          3. DEX Update:

            1. Readiness:

            2. DEX: Auto-scaling is on in dev, and load testing in dev was successful; changes are in progress to apply it in STG, with additional testing in progress in STG (target: 9/7/23)

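The concurrency question discussed above (files sent in parallel as they arrive, with an optional cap of ~10 or 100 at a time) can be illustrated with a minimal Python sketch. This is not the AIMS implementation: `send_file` is a hypothetical stand-in for the real upload call, and `max_concurrent=None` mirrors the consensus to send without a limit during load testing.

```python
import asyncio
from typing import Iterable, Optional


async def send_file(name: str) -> str:
    """Hypothetical stand-in for an upload to DEX; it only simulates I/O."""
    await asyncio.sleep(0.01)
    return name


async def send_all(names: Iterable[str], max_concurrent: Optional[int] = None) -> dict:
    """Send files in parallel; cap in-flight uploads if max_concurrent is set."""
    sem = asyncio.Semaphore(max_concurrent) if max_concurrent else None
    in_flight = 0
    peak = 0

    async def worker(name: str) -> str:
        nonlocal in_flight, peak
        if sem is not None:
            await sem.acquire()  # wait for a free slot when a cap is set
        in_flight += 1
        peak = max(peak, in_flight)
        try:
            return await send_file(name)
        finally:
            in_flight -= 1
            if sem is not None:
                sem.release()

    sent = await asyncio.gather(*(worker(n) for n in names))
    return {"sent": len(sent), "peak_in_flight": peak}


if __name__ == "__main__":
    files = [f"file_{i}" for i in range(50)]  # made-up file names
    print(asyncio.run(send_all(files, max_concurrent=10)))
    print(asyncio.run(send_all(files)))  # no limit, as agreed for load testing
```

With a cap, the sender never has more than `max_concurrent` uploads open at once; without one, every file that has arrived is in flight together, which is the condition the load test is meant to exercise.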

Mtg Notes:

  • Notes:

    • Load Testing Checkpoints Meetings: Daily checkpoint

      • Attendees: Erroll; DEX Upload Team: Anand [Rohit as backup]; AIMS: Charles Steele [PT time zone], Wouter (will have counts to share)

      • Tue, Wed, Thu at 3pm ET

      • @Gretl Glick will send out meeting invite for 9/12, 13, 14 at 3pm ET

Consensus Decision

Decision:

  • AIMS is ready; DEX is ready = Go

 

 

Load test Success Criteria:

  1. DEX can receive, handle, and process all messages and data

  2. Upload Time/Performance – how long it takes to upload files based on file size

    1. AIMS: Does not have functionality to capture metric of file size currently; long-term observability functionality is part of dev backlog

      1. Would the server side be able to capture metrics? DEX: Can monitor performance metrics on the server side; the start time and end time can be reviewed for every upload

      2. DEX: Focus is on how long it takes to upload large files

  3. File counts match between AIMS and DEX

    1. If a file fails to upload, the upload will not be resumed; if a file does NOT complete (e.g., the connection is dropped), it will be retried later

      1. AIMS: If a file upload is interrupted, the upload would restart from the beginning [potential for partial uploads if the connection is interrupted]; resuming from a partial upload would take a large LOE; if transmission stops, the container essentially closes

      2. DEX: This should not be an issue for small files; large files may be an issue

      3. Will monitor during load testing

      4. DEX: In order to transition to production, will need to have resume file upload using tu..? client [Long-term requirement for production cutover]; Lambda is currently time-bound on the current infrastructure: 15 minutes (2-4 GB)

      5. CDPH: File size: 600-800k records per file; updating to smaller file size; sent on Wed afternoon

      6. AIMS isn’t using Tus resume ==> DEX: Look for incomplete uploads
        Reported: Geo
        We are not resuming with the Tus client. We are not using Tus client retry. We just close the Lambda. If an upload does not complete (e.g., the connection is dropped), it is a failed transmission. We restart the upload with a new tguid.
        Failure mode: Lambda timeouts (15 min). Reported: Geo
        Failure mode: Largest file: California YTD (Year To Date) file, sent on Wed or Thu. Emily Augustini: 600-800k records
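
Success criteria 2 and 3 above (upload time derived from server-side start and end times, and matching file counts between AIMS and DEX) could be checked along the lines of the following Python sketch. The record layout (`file_id`, `start`, `end`), the function names, and the one-minute margin are assumptions for illustration only; the 15-minute threshold is the Lambda time bound reported in these notes.

```python
from datetime import datetime, timedelta

LAMBDA_LIMIT = timedelta(minutes=15)  # Lambda time bound reported in the notes


def reconcile(aims_sent, dex_received):
    """Compare the file IDs AIMS sent with the file IDs DEX received."""
    sent, received = set(aims_sent), set(dex_received)
    return {
        "counts_match": sent == received,
        "missing_at_dex": sorted(sent - received),
        "unexpected_at_dex": sorted(received - sent),
    }


def upload_durations(records):
    """Map file_id -> upload duration from per-upload start and end times."""
    return {r["file_id"]: r["end"] - r["start"] for r in records}


def near_timeout(durations, margin=timedelta(minutes=1)):
    """File IDs whose upload time came within `margin` of the Lambda limit."""
    return sorted(f for f, d in durations.items() if d >= LAMBDA_LIMIT - margin)


if __name__ == "__main__":
    # Made-up sample data, not results from the load test.
    t0 = datetime(2023, 9, 12, 15, 0)
    records = [
        {"file_id": "small.hl7", "start": t0, "end": t0 + timedelta(seconds=30)},
        {"file_id": "large.csv", "start": t0, "end": t0 + timedelta(minutes=14, seconds=30)},
    ]
    print(reconcile(["small.hl7", "large.csv", "lost.csv"], ["small.hl7", "large.csv"]))
    print(near_timeout(upload_durations(records)))
```

Because a restarted upload gets a new tguid, this sketch matches on a stable file identifier rather than the upload ID so that retries are not double counted; that choice is also an assumption.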

Questions

  •  

Next Steps & Action Items

Next steps:

  • @Gretl Glick will send out meeting invite for 9/12, 13, 14 at 3pm ET

  • @Gretl Glick will send out a Doodle poll for a debrief/retrospective meeting covering what went well, what did not go well, and next steps/action items

 

 

Previous Action items

  • Not yet started: AIMS: Send CELR Production data (n = 24 hours~) to DEX STG Environment: Target: ~Sep 12-14, 2023