Date

Attendees:

APHL

CDC

Peraton

Dari Shirazi:

Megan Light:

Erroll Rosser:

Vanessa Holley : X

Teresa Jue: --

Kristin Peterson :

Brooke Beaulieu: X

Cheri Gatland-Lightener:

Tom Russell: --

Mel Kourbage:

Norris Kpamegan :

Marcelo Caldas:

Gretl Glick: X

Ryan Harrison: X

Don Lindsay: X

Geo Miller: X

Leslyn Mcnabb:

Alissa McShane:

Marion

Goals

  • Update on Data Lake Migration Status

  • DEX Overview and Status Update

Discussion topics

Item

Notes

Overall Status Updates

DEX Development, Status Updates, and Timeline:

Upload API: Working on v1.1 (in progress): status of upload

    • v1.1.2: will work on authenticating via SAMS SSO

    • Milestones on progress

    Reviewing AIMS connection to DEX options: Research/Analysis: In Progress

    • AIMS/APHL Updates:

      1. SDS (Send to DEX Service) / Upload API

        1. Tested connectivity successfully

        2. Prototyped the runtime using a Docker image and Lambda in the AIMS sandbox

        3. Next:

          1. Determine what a load/performance test should look like & perform the test

          2. Determine if we want the runtime to be in Lambda, Mirth, Fargate, or k8s

      2. AWS DataSync

        1. Prerequisite for transfer test: an EC2 instance to run the DataSync agent (not completed)

        2. Tested connectivity to S3 and Azure locations

        3. Further research/additional information needed regarding cost / architecture questions for scaling this solution long term

      3. Azure Logic Apps / Connectors

        1. Further research/additional information needed regarding the cost of running this in Azure

        2. Need help understanding continued monitoring & maintenance of a logic app

        3. https://learn.microsoft.com/en-us/microsoft-365/community/power-automate-vs-logic-

      AIMS has reviewed the docs but has not yet requested to be added to the DEX activity.
      • Pro: the resumable technology looks like it solves the problem of supporting large file uploads to web APIs.

      • Con: it would be a client agent within AIMS that needs continual maintenance, the ability to track success/failure (queue/dead letter), and a team prepared to debug if certificates, credentials, etc. change.

    • AWS DataSync to Azure Blob Storage:

      • Background: the integration for Azure Blob Storage is notably a “preview” feature on AWS’s side.

      • Pro: the job would be configured on each side with no code to maintain.

      • Con: the job has to run on some cron frequency, so data will flow in chunks as opposed to a “real time” stream.

    • Alternative research: look at another project’s potential integration into Azure using Azure Connectors for SQS and S3 to see if it would apply to this problem. This would match how AIMS already handles event-driven processing internally and would build on what the team already knows how to use.

      • Geo is prototyping this option and will have an update next time we meet

      • Need to investigate the effort on the Azure side to make configurations and/or services to use these connectors.

    • Research in progress, AIMS has not requested access

    • Other Options: Will continue to research

Mtg Notes:
        1. apps

      1. RAS (Receive from AIMS Service)

        1. With a slight modification, the SDS can run as a poller

        2. Have built similar poller services many times, can be done with this for performance testing in a few days if there is interest

        3. Would run on the DEX/Azure side to poll SQS and get data from AIMS S3
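A minimal sketch of one polling pass for the RAS idea above, assuming the AIMS S3 bucket publishes standard S3 event notifications directly to the SQS queue. The `sqs` and `s3` arguments are expected to behave like boto3 clients (`receive_message`, `delete_message`, `get_object`); `poll_once`, `handle_object`, and the queue URL are hypothetical names used for illustration, not part of SDS or DEX.

```python
import json
from urllib.parse import unquote_plus


def poll_once(sqs, s3, queue_url, handle_object, max_messages=10, wait_seconds=20):
    """Receive one batch of S3 event messages and pass each object to handle_object.

    Returns the number of objects handled. A message is deleted from the queue
    only after every record in it has been handled without raising.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows at most 10 per call
        WaitTimeSeconds=wait_seconds,      # long polling, at most 20 seconds
    )
    handled = 0
    for message in response.get("Messages", []):
        event = json.loads(message["Body"])
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            obj = s3.get_object(Bucket=bucket, Key=key)
            # "Metadata" carries the user-defined metadata stored with the object.
            handle_object(key, obj["Body"].read(), obj.get("Metadata", {}))
            handled += 1
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return handled
```

In a real service this would run in a loop, and `handle_object` would be whatever writes the payload and metadata to the DEX side; both are left open here. Reading the whole body into memory is a simplification that large files would likely rule out.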


Metadata values required by DEX

  • meta_destination_id – unique identifier used to indicate the program associated with the upload.

  • meta_ext_event – unique identifier used to indicate the event type within the program that this file belongs to (e.g., routineImmunization).

Optional for all:

  • meta_schema_version

  • How is optional metadata for a specific meta_destination_id/meta_ext_event/meta_schema_version defined?

  • Does this need to be defined before sending metadata values or will DEX allow any additional optional metadata values to be sent?
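As an illustration of the metadata discussed above, here is a small sketch of a pre-send check. It only knows the three key names listed in these notes; the values for `meta_destination_id` and `meta_schema_version` are made up, and `check_metadata` is a hypothetical helper, not part of the DEX Upload API. Extra keys are reported rather than rejected, since whether DEX accepts additional optional metadata is still an open question.

```python
REQUIRED_KEYS = ("meta_destination_id", "meta_ext_event")
OPTIONAL_KEYS = ("meta_schema_version",)


def check_metadata(metadata):
    """Return (missing, extra): required keys that are absent or blank,
    and keys outside the required/optional names listed in the notes."""
    missing = [
        key for key in REQUIRED_KEYS
        if not str(metadata.get(key) or "").strip()
    ]
    known = set(REQUIRED_KEYS) | set(OPTIONAL_KEYS)
    extra = sorted(key for key in metadata if key not in known)
    return missing, extra


example = {
    "meta_destination_id": "exampleProgram",  # hypothetical value
    "meta_ext_event": "routineImmunization",  # example from the notes
    "meta_schema_version": "1.0",             # hypothetical value
}
missing, extra = check_metadata(example)
```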

Mtg Notes:

  • Prototyped download to bucket:

    • No issues using upload API

    • Meta Data Questions

    • Viable path forward

  • Option 2: AWS Data Sync

    • Runs on schedule

    • Cost implications/questions/ blob storage time period

    • Requires EC2 configuration, without monitoring and auditing services

    • After the copy completes, do objects in Azure keep the same keys as in S3? No; metadata is not copied over to blob metadata. Not impossible, but a solution would need to be created

      • Would need either to …option 1 write custom connector or

      • Questions on long-term monitoring and cost

  • Option 3:

    • Requires EC2 configuration, without monitoring and auditing services

    • After the copy completes, do objects in Azure keep the same keys as in S3? No; metadata is not copied over to blob metadata. Not impossible, but a solution would need to be created

      • Would need either to …option 1 write custom connector or

      • Questions on long-term monitoring and cost

      • CDC is pulling data

  • Option 4:

    • Almost same as Option 1

    • Instead of pushing to an HTTP endpoint, could just have DEX pull from AIMS

  • Pros/Cons to all options:

    • Either using the upload API or having DEX pull data from the S3 bucket on AIMS is the most viable

      • DEX: What is the thing doing the pull? AIMS: Would provide code to DEX, set of library code

      • AIMS: This would be similar to initial data pull for EIP+

      • Could make a new code base, would share code with DEX, used for other integration projects

  • Decision Point:

    • Load test/Performance Test: Using meta data

      • Option 1: AIMS will need to decide where to host it, but could start that next week

      • Option 4: Needs a bit of refinement, but also viable

      • Option 2,3: Would need additional developer research/support to implement

    • Uncertain if metadata comes across with Option 2 (Data Sync); the scheduling feature and growth of data would need to be configured/monitored

    • AIMS/DS: Choice needs to be considered from long-term viability and ability to use for other programs/use cases

      • GM: The bucket would use a per-project prefix, and the metadata would be mapped to the data upload API object, so this could be scaled

      • DS: Option 4 is nicer for AIMS, as less code is maintained on AIMS; Polling from CDC; Queue can be monitored, but is available for CDC to poll at any time

        • Question is who is monitoring queue/polling

    • DEX: In Option 4, how is Meta Data received? AIMS: Get object command would make meta data available; DEX: Meta Data would be persisted in Azure blob

  • Leslyn: Are there any programs for which APHL would need to pull data back from CDC? Dari: Currently, EIP is the only program which pushes data from CDC, but that would be migrated to the CDC Platform in the intermediate term

  • Leslyn: Total daily volumes: would it be possible to provide these per program?

    • AIMS: Can provide overview of volume for DLs, and potentially other programs

      • Action item: Provide counts to Leslyn and Ryan

  • AIMS: Preference would be Option 4; Option 1 would need to be coordinated long-term with DEX, so higher LOE

    • DEX: Coordination would not be needed, DEX would

    • DEX: Could we test Option 1?
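GM's point in the notes above (bucket prefix per project, S3 metadata mapped onto the upload object) and the Option 4 answer that a get-object call exposes the metadata could look roughly like the sketch below. The output field names (`project`, `filename`, `metadata`) and the function `describe_object` are hypothetical; the actual shape of the DEX upload object is not defined in these notes.

```python
def describe_object(key, s3_metadata):
    """Split an S3 key into a project prefix and the remaining path, and carry
    the object's user-defined metadata along unchanged."""
    project, separator, remainder = key.partition("/")
    if not separator:
        # No prefix present: treat the whole key as the file name.
        project, remainder = "", key
    return {
        "project": project,             # hypothetical field name
        "filename": remainder,          # hypothetical field name
        "metadata": dict(s3_metadata),  # copied so the caller's dict is not shared
    }


described = describe_object(
    "exampleProject/2023/file.hl7",             # hypothetical key layout
    {"meta_ext_event": "routineImmunization"},  # example value from the notes
)
```

Keeping the metadata dictionary intact leaves it to the receiving side to decide which keys are persisted as Azure blob metadata.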

DRAFT Data Flow Diagram

Image Added

Questions

Next Steps & Action Items

  • Next steps:

Action items
