AI Data Architect | Healthcare AI Platform
Genzeon Corporation — Healthcare Division
Exton, PA / Hybrid | 0–4 years | Full-time
AI-native product architect with data engineering experience needed for product build-out.
The short version: We run a multi-model AI pipeline that processes 150K Medicare documents/year — faxed PDFs, EDI transactions, FHIR data, clinical notes. You’ll design and build the data architecture that ingests, stores, governs, and serves all of it to AI models and clinical reviewers. On-prem GPUs, hybrid cloud, HIPAA compliance. This is the real thing.
What you’ll do:
Design the end-to-end data architecture for a healthcare AI platform: ingestion, storage, processing, serving, governance
Build pipelines for heterogeneous healthcare data: faxed PDFs, X12 EDI (835/837/278), FHIR R4, HL7v2, CMS files, unstructured clinical notes
Architect the data lake/lakehouse layer (Apache Iceberg, MinIO, DuckDB, PostgreSQL/pgvector)
Design the embedding and vector storage layer that powers RAG: chunking, indexing, retrieval optimization
Build data lineage tracking from source document to AI decision
Implement HIPAA/HITRUST data governance: encryption, access controls, audit logging, PHI handling
Monitor data quality across the pipeline: schema drift, completeness, freshness, anomalies
Optimize for hybrid infrastructure: on-prem GPUs (RTX 5090, L40S), NAS, Azure GovCloud, Azure Commercial
What you need:
A data pipeline you’ve built that ran in production (we’ll ask about it)
SQL fluency and Python proficiency
Experience with at least one of: Spark, dbt, Airflow, Dagster, Prefect
Hands-on work with unstructured or semi-structured data — PDFs, images, OCR outputs, free text
Practical understanding of vector databases, embeddings, and how RAG systems consume data
Comfort with on-premises infrastructure, not just managed cloud services
Data quality and governance as instincts, not afterthoughts
Strong signals:
Healthcare data formats (X12 EDI, FHIR, HL7, CCD/C-CDA)
Apache Iceberg, Delta Lake, or modern table formats
MinIO / S3 / object storage architecture
pgvector, Pinecone, Weaviate, or similar vector stores
DuckDB or embedded analytical engines
HIPAA technical safeguards implementation
ML data pipelines — training data, feature stores, evaluation sets, feedback loops
We don’t require:
A data engineering bootcamp cert
Mastery of the entire “modern data stack”
Prior healthcare experience (but it helps)
A specific degree
To apply, submit:
1. Resume
2. Link to a data project you’ve built (GitHub, architecture diagram, write-up)
3. 200 words max: “Describe the messiest data problem you’ve encountered. How did you solve it?”