Data Pipeline Development for AWS Database

Remote, USA Full-time
Job Posting: AWS Data Pipeline Engineer (ETL - Multi-Source to PostgreSQL)

Project Overview
Build a production-ready ETL pipeline that extracts data from 3 source systems (Oracle/SAP, Microsoft Dataverse CRM, MySQL), transforms it with complex business logic, and loads into AWS RDS PostgreSQL as 3 master tables. Daily automated execution serving an app on AWS.

Technical Stack (Required)
AWS Services: Glue (PySpark), Step Functions, RDS PostgreSQL, S3, Secrets Manager, CloudWatch
Languages: Python 3.9+, PySpark 3.x, SQL
IaC: Terraform or CloudFormation
Sources: Oracle (JDBC), MySQL (JDBC), Dataverse API (OAuth 2.0)

Scope
Extract: 10 tables from 3 DBs (~500MB total)
Transform: Complex joins, aggregations, calculated fields to 3 master tables
Load: PostgreSQL with Row-Level Security policies
Orchestrate: Step Functions with error handling, monitoring, alerting
Schedule: Daily execution, less than 2 hour completion time

Key Challenges
Multi-source integration (JDBC + REST API)
Complex transformations (multi-table joins, aggregations, JSONB structures)
PostgreSQL RLS implementation (role-based data access)
Data quality validation and reconciliation
GDPR compliance (EU region, encryption)

Deliverables
Code: Glue extraction/transformation/load jobs, Step Functions workflow, tests (greater than 80% coverage)
IaC: Terraform/CloudFormation for all AWS resources
Database: DDL scripts with RLS policies
Documentation: Architecture diagram, deployment guide, runbook
Testing: Integration test suite, performance test results

Required Skills
✅ 5+ years AWS (Glue, Step Functions, RDS)
✅ Expert PySpark for complex ETL
✅ PostgreSQL (including RLS)
✅ JDBC connections (Oracle, MySQL)
✅ REST API integration (OAuth, pagination)
✅ Infrastructure as Code
✅ Data quality frameworks

Highly Desirable
Microsoft Dataverse/Dynamics 365 API experience
AWS Glue Data Catalog
CI/CD pipelines
GDPR compliance experience

Timeline & Budget
Duration: TBD
Availability: TBD
Payment: Milestone-based
Budget: Negotiable based on experience

To Apply
Provide:
Portfolio: Links to similar AWS ETL projects (GitHub preferred)
Brief approach: How would you architect this pipeline? (200 words)
Dataverse experience: Have you worked with Dataverse API? Describe briefly.
Availability: Start date and weekly hours
Rate: fixed-price proposal

Questions
Largest data volume processed with AWS Glue? Optimization techniques used?
Experience with PostgreSQL Row-Level Security?
Terraform or CloudFormation preference and why?

Ideal Candidate
Built production ETL pipelines on AWS for enterprise clients
Comfortable with complex transformations and business logic
Writes clean, testable, maintainable code
Works independently, communicates proactively
Can deliver production-quality work with minimal supervision

Location: Remote (EU timezone preferred)
Client: AWS Advanced Partner, manufacturing client in Greece
Region: EU (Frankfurt) for GDPR compliance
Tags: AWS Glue, PySpark, ETL, PostgreSQL, Data Pipeline, AWS Step Functions, Python, Oracle, MySQL, Dataverse, Terraform, Data Engineering