Upload API: Working on v1.1 (in progress): status of upload
v1.1.2: Will work on authenticating via SAMS SSO
Milestones in progress
Reviewing AIMS connection to DEX options: Research/Analysis: In Progress
Upload API - AIMS has reviewed docs, but has not yet requested to be added to the DEX activity.
Pro: resumable upload technology looks like it solves the problem of supporting large file uploads to web APIs.
Con: it would be a client agent within AIMS that would need continual maintenance, the ability to track success/failure (queue/dead letter), and a team prepared to debug if certificates, credentials, etc. change.
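To make the "track success/failure (queue/dead letter)" concern concrete, here is a minimal sketch of what such a client agent would have to carry. Everything in it is hypothetical: the `UploadTracker` class, the `max_attempts` setting, and the `upload` callable stand in for whatever the real agent and the Upload API client would look like.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class UploadTracker:
    """Hypothetical bookkeeping for a client agent: retry, then dead-letter."""
    max_attempts: int = 3
    succeeded: list = field(default_factory=list)
    dead_letter: list = field(default_factory=list)

    def run(self, files: Iterable[str], upload: Callable[[str], None]) -> None:
        # Each queue entry is (file name, attempt number).
        pending = deque((name, 1) for name in files)
        while pending:
            name, attempt = pending.popleft()
            try:
                upload(name)  # assumed to raise on failure
            except Exception as exc:
                if attempt >= self.max_attempts:
                    # Out of retries: park the file with the last error for debugging.
                    self.dead_letter.append((name, str(exc)))
                else:
                    pending.append((name, attempt + 1))
            else:
                self.succeeded.append(name)
```

Even in this reduced form, someone has to watch `dead_letter` and work out whether a failure came from an expired certificate, changed credentials, or the file itself, which is the maintenance cost noted above.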
AWS DataSync to Azure Blob Storage:
Background: the integration for Azure Blob Storage is notably a “preview” feature on AWS’s side.
Pro: job would be configured on each side with no code to maintain.
Con: this is a job that has to run on some cron frequency, so data will flow in chunks as opposed to a “real time” stream.
Alternative research: look at another project’s integration into Azure using Azure Connectors for SQS and S3 to see if it would apply to this problem. This would match how AIMS already handles event-driven processing internally and would build on what the team already knows how to use.
Geo is working on prototyping this option and will have an update next time we meet.
Need to investigate the effort on the Azure side to make configurations and/or services to use these connectors.
Research in progress, AIMS has not requested access
Other Options: Will continue to research
Mtg Notes:
CDC: Any needs from DEX team?
AIMS: Need to request access to evaluate options; Azure connections, could test; AWS DataSync--concerns re: chunking data and setting up EC2 instance; have not proceeded to testing this option, but can do so
Timeline: Staging data streams set up: ER: Currently in dev, working on QA env [AKA ONB env] by end of Q3 (Late June/July 2023) --[Peraton working on contract extension post 7/23/23]
CDC: So not quite ready to receive messages currently, for next ~6 weeks
Rough timeline: July 2023--but would like to test with dummy data
CDC: 2 week Sprint: Decide on transport option; Next sprint: test dummy data; 3rd sprint: prod data in Staging: ~July 2023
Questions
CDC: Azure Connectors for SQS and S3 to Azure Blob: How is meta data communicated to CDC?
The SQS message would provide a pointer to the object being created, which would then be downloaded from S3--would need to research whether meta data would be copied as part of that step--need to determine if viable option
CDC: Flagging this--need enough meta data to determine routing to the appropriate program (long-term transport needs to factor in meta data communicated/retained for routing)--is meta data in SQS?
AIMS: Could enhance SQS meta data; meta data currently in S3 would need to be copied to blob storage, or have 2 different sections of the object
AIMS: Would need to run on DEX side, correct? SQS Event from AIMS, DEX would be consuming queue and downloading from S3 (portion which is running is on DEX Side)--would be using Azure connectors on DEX side; would configure Azure connector on DEX side, consume queue, and download data from S3--meta data--can confirm
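As a rough illustration of the "pointer in SQS, download from S3" flow discussed above, the sketch below pulls the bucket and key out of an SQS message body, assuming the body is a standard S3 event notification. The function name `s3_pointers` and the bucket and key values are made up for illustration. To our understanding the standard notification carries the object key, size, and similar fields but not user-defined object meta data, which is why the routing question above still needs an answer (for example, AIMS enhancing the message).

```python
import json
from urllib.parse import unquote_plus


def s3_pointers(sqs_body: str) -> list:
    """Return bucket/key pointers found in an S3 event notification body."""
    event = json.loads(sqs_body)
    pointers = []
    for record in event.get("Records", []):
        if record.get("eventSource") != "aws:s3":
            continue  # ignore anything that is not an S3 event
        s3 = record["s3"]
        pointers.append({
            "bucket": s3["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 notifications.
            "key": unquote_plus(s3["object"]["key"]),
            "size": s3["object"].get("size"),
        })
    return pointers


# Hypothetical message body, shaped like an S3 "ObjectCreated" notification.
sample_body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "aims-example-bucket"},
            "object": {"key": "celr/2023/report+1.hl7", "size": 2048},
        },
    }]
})
print(s3_pointers(sample_body))
```

The consumer on the DEX side would then use each pointer to download the object; any routing meta data would have to come from the message itself or from a separate lookup on the object.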
ER: Vision of frequency of data ingestion from DEX? CDC: Near real-time pull from SQS/S3 in this option (and in other options); latency of less than 5 minutes
AIMS: Con for data sync would rely on cron job, so less than real-time, but will continue to research if it can be near real-time; Timeliness--Azure connector would be near real-time
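A quick way to compare a cron-based sync against the "less than 5 minutes" latency target is to look at the worst case: a file that lands just after a run starts waits one full interval for the next run, plus however long that run takes. The sketch below encodes that arithmetic; the interval and run-time figures are placeholders, not measured values.

```python
def worst_case_latency_minutes(interval_min: float, run_min: float) -> float:
    """Worst case: wait a full interval for the next run, then the run itself."""
    return interval_min + run_min


def meets_target(interval_min: float, run_min: float, target_min: float = 5.0) -> bool:
    """True if the worst-case latency stays under the target (default 5 minutes)."""
    return worst_case_latency_minutes(interval_min, run_min) < target_min


# Placeholder figures for illustration only.
print(meets_target(interval_min=60, run_min=2))  # hourly job
print(meets_target(interval_min=2, run_min=1))   # job every 2 minutes
```

Under these assumptions the schedule interval alone has to be well under 5 minutes for a scheduled job to qualify, while an event-driven consumer has no interval term at all.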
Other project has used this on AIMS, so will research (EIP+?); ER: Currently using test environment--upload to EIP, goes through S3 into Azure blob, uncertain of transport; MC: EIP+ had to create logic app, and ?, some challenges; Saxion? brings data from AWS into CDC data hub
Is the solution being provided a long-term, broad solution?
CDC: Yes, would hope to use this solution for other data streams, would need to conduct performance test to confirm
AIMS: Would like to leverage use of meta data to migrate older data streams to CDC (e.g. PHINMS reporting streams); or other Data Lakes
ER: Expand Meta Data requirements for other program streams migrations
Current Data Lakes on AIMS:
CELR: Lab Data: Close integration with DEX & CELR Migration teams
EIP+M: Case surveillance Data (Includes FDD MMG (Case V2) & HAI MDRO MMG--Case V2) [Data comes to CDC/MVPS, then is routed to AIMS for storage]
Working with programs
DAART: Antimicrobial Resistance Lab Network Lab Data: Will need to enhance working relationships
CDC: Intent is to identify meta data for transport mechanism, not the actual meta data keys needed for migrating other programs
CDC: Do any of these use PHINMS? And do we have a list of PHINMS Connections?
None of the projects use PHINMS
AIMS: Yes, have a list of PHINMS Connections, but limited progress for retiring PHINMS
CDC: PHINMS retirement strategy--unaware of specific CDC point of contact; AIMS: Have been working with CDC DMB group (MVPS/Joseph Mai xmk0@cdc.gov)--Flu data sent by PHINMS, AMD (4 Mirth channels from AIMS > CDC); fairly large volume of data flowing from AIMS to CDC via PHINMS
DEX: Will need to identify CDC groups to create migration plan for each route --(Potentially Janie Williams is PM for PHINMS?)
DS: Does Geo need additional Azure resources for prototyping? GM: Yes, would be good to have additional Azure resources, but long-term subscription would be NTH; RH: If GM has a CDC account (badged, CDC email), he would be able to request an SU account and Azure subscriptions, so easy to add; more challenging to add if he does not have a CDC account [Dari has CDC account, but needs to renew it]
DS: Can have APHL provide paid Azure subscription
DEX Dev/Migration Phases:
Target: CELR Migration--run messages from pipeline --more immediate: July 2023
Target: DEX transport: Pipeline creation: (dependent on routing mechanisms/meta data inclusion within CDC to programs being created)
Next Steps & Action Items
Next steps:
Geoffery Miller: Continue to research options and decide on option; AIMS will create data flow diagram
Gretl Glick: Next meeting: schedule for June 1, 2023 at 1pm ET