8/8/23 CELR Data Lake Migration Meeting notes



Aug 8, 2023








Dari Shirazi: X

Megan Light: --

Erroll Rosser: X

Vanessa Holley: --

Teresa Jue: --

Kristin Peterson: --

Gretl Glick: X

Cheri Gatland-Lightener: --

Tom Russell: --

Geo Miller: X

Norris Kpamegan :

Marcelo Caldas: X

Alissa McShane: --

Ryan Harrison: X

Don Lindsay: --

John Reaves: X

Marion Anandappa: --


Emily: X

Cosmin: X



Rohit Panwar: X



Leslyn Mcnabb: X



  • Update on Data Lake Migration Status

  • DEX Overview and Status Update

Discussion topics





Overall Status Updates

  • AIMS connection to DEX options: Research/Analysis: In Progress

    • AIMS/APHL Updates:

      1. Option 1: SDS (Send to DEX Service) / Upload API

        1. Tested connectivity successfully

        2. Prototyped the runtime using a Docker image and Lambda in the AIMS sandbox

      2. Action Item [Completed]: AIMS confirmed volume for CELR: (1) max flow during pandemic; (2) current flow (concurrency and file size):

        1. Provided to DEX team 7/26/23 (AIMS-46695)

      3. DEX: Complete internal load testing:

        1. CELR load testing is sent to DEX STG: 4-6 weeks to complete internal load testing (mid-August ~)

          1. DEX Update:

        2. Action Item [In Progress] : Confirm & identify ongoing operations, resources, support/training: Tentative: Late August

        3. AIMS: Send CELR Production data (n = 24 hours~) to DEX STG Environment: Target: ~August 21, 2023 (will confirm dates as we assess readiness)

          1. Update:

            1. DEX: Load testing has not yet started; auto-scaling and health checks have been the priority

            2. Next sprint: focus on dev environment and auto-scaling in dev; auto-scaling will be rolled out step-wise, starting in dev, then moving to STG (will run load in STG and monitor stats)

            3. Auto-scaling should be available in STG by 9/12 (30 instances available)

            4. ~2 sprints away: Aug 31st--targeted readiness for DEX

            5. Load testing (sending CELR Prod data to DEX STG): would include HL7 and CSV --New Target: Week of September 11, 2023

          2. Criteria/Parameters for ingestion component testing:

            1. Volume: 10-20 terabytes per month; ~50 GB/day

            2. Period of Reporting: DEX: Preference is to receive 3 days of data [completed days--sequence-- low, average, high volume] --Include CDPH

            3. Data Type (are CSV submissions in scope?): HL7 and CSV

          3. Action item: Decision: AIMS will send split data feed with live data, over 3 day period to DEX: Tuesday through Thursday: September 12-14, 2023 [Geo is OOO week of September 11, but will assign alternative AIMS resource]

            1. DEX: Will send report of files received, volume to AIMS to verify counts--ingest components post send

            2. If needed, would AIMS be able to redo the send the following week? AIMS: Yes

Mtg Notes:

  • Notes:


  • LM: Would like to confirm--this first test is just testing volume, not validation of HL7 pipelines/CSV pipelines:

    • APHL: Focus would be primarily on volume, ingestion

    • RH: Testing on 9/11 would be an ingestion test--component testing [Upload API receive], not validation of HL7

      • When HL7 and CSV validation is available, could re-run CELR data to test validation pipelines

      • DEX: 2 connectivity tests:

        • 1st test: Week of 9/11: component/ingestion test

        • 2nd test: End to end test to verify validation pipelines

      • Action Item: ER: Need to identify scope and criteria for 2nd E2E test; Action Item: Schedule discussion to identify criteria, scope for validation E2E test @Gretl Glick @Erroll Rosser

        • E2E refers to transport, ingest, validation, and delivery to program

        • if E2E is compared to current CELR, the counts will be slightly different:

          • CELR is currently conducting validation on AIMS; in new paradigm, DEX is conducting validation of HL7

        • Happy path: counts should match

        • If counts do not match, then how can we verify volume testing?

          • Time alignment may also be a challenge; may want to compare counts for specific files, not time period

          • Validation counts may be out of scope for current period

          • Ingestion counts can be verified:

            • Dari: May be challenging to confirm counts--100 files sent, for CSV and HL7 (but would not be able to confirm row-level number of results in CSV files)

  • LM: Would load testing be a batch of HL7 and CSV? Are they mixed together for sending?

    • AIMS: Can do both, more valid test to do both to replicate volume

    • DEX: Data type should not matter, should do both to replicate real-world scenario

  • ER: What happens if a specific day does not meet the expected load-testing volume (too low)?

    • AIMS: Splitting feed accommodates sending in real time, would leave on so flow of data replicates real-world data submission; can also manually inject files (e.g. CDPH) into test if needed to meet expected load volume

    • ER: Historic trend is decreasing; if historic data from ~2021 were chosen, it would be a true stress test; otherwise 2023 reflects a lower trend in data reporting

    • AIMS: Sending a day's data would list everything and send as fast as possible, but would not reflect the cadence of data being sent in real time; it would reflect a more extreme stress test

    • Consensus: Split feed

  • Go/No-go meeting: 9/7: 12-1pmET @gretl will send updated meeting invite
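
For the ingestion-count check discussed above (comparing specific files rather than a time period), a minimal Python sketch of a per-file reconciliation follows. The notes do not describe the format of the DEX report or of an AIMS send log, so the inputs here (a mapping of file name to byte size on each side) and all names such as `reconcile` and `Reconciliation` are hypothetical. The check is file-level only; as noted above, row-level result counts inside CSV files would not be confirmed this way.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Reconciliation:
    """Outcome of comparing files sent against files reported as received."""
    matched: list[str] = field(default_factory=list)
    missing: list[str] = field(default_factory=list)        # sent, not received
    size_mismatch: list[str] = field(default_factory=list)  # received, wrong size
    unexpected: list[str] = field(default_factory=list)     # received, never sent


def reconcile(sent: dict[str, int], received: dict[str, int]) -> Reconciliation:
    """Compare two {file name: size in bytes} mappings file by file.

    Both mappings are assumed inputs: one built from the sender's log,
    the other from the receiver's report.
    """
    result = Reconciliation()
    for name, size in sorted(sent.items()):
        if name not in received:
            result.missing.append(name)
        elif received[name] != size:
            result.size_mismatch.append(name)
        else:
            result.matched.append(name)
    result.unexpected = sorted(set(received) - set(sent))
    return result


if __name__ == "__main__":
    # Made-up file names and sizes, for illustration only.
    sent = {"a.hl7": 100, "b.csv": 200, "c.hl7": 300}
    received = {"a.hl7": 100, "b.csv": 150, "d.csv": 10}
    r = reconcile(sent, received)
    print("matched:", r.matched)
    print("missing:", r.missing)
    print("size mismatch:", r.size_mismatch)
    print("unexpected:", r.unexpected)
```

Keying on file name avoids the time-alignment problem, since a file sent late on one day and received early on the next still pairs up.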


Next Steps & Action Items

Next steps:

  • In progress: DEX: Continue internal DEX performance/load testing: Target Readiness: August 31, 2023

  • In progress: @Gretl Glick Schedule go/no-go meeting: Sept 7, 2023: 12-1pmET

  • In progress: @Gretl Glick @Erroll Rosser Schedule meeting to discuss end to end validation testing criteria/scope

  • Not yet started: AIMS: Send CELR Production data (split feed with live data, over 3-day period) to DEX STG Environment: Target: Sep 12-14, 2023

Previous Action items

  • In progress: @Geoffery Miller @dari.shirazi@aphl.org AIMS: Confirm & identify ongoing operations, resources, support/training for AIMS>DEX, Upload API: Tentative: Late August


