
Meredith Corporation, 1716 Locust Street, Des Moines, Polk County, Iowa, 50309, United States
The AI Infrastructure GPU Operations Team drives deployment planning, execution governance, operational readiness, reliability, and business rhythm for Oracle Cloud Infrastructure's (OCI) rapidly expanding GPU infrastructure portfolio. As AI becomes embedded across our products and services, we help customers turn that promise into a better future for all. Our platform group ensures the reliability and scalability of large-scale GPU fleets, supporting distributed AI training and inference workloads across multi-region clusters.
True innovation starts when everyone is empowered to contribute. We are committed to growing a workforce that promotes opportunities for all, fostering a culture where structured, data-driven leaders can thrive in collaborative environments.
As a Principal Technical Program Manager, you will lead cross-functional programs that connect engineering, platform, operations, business, finance, observability, SRE, network, and leadership teams. You will own the operating mechanisms for regional deployment readiness, GPU fleet health, milestone tracking, executive reporting, incident and change governance, and operational handoff across multiple concurrent GPU operations programs.
This role is designed for a structured, data-driven program leader who values simplicity, scalability, and clear operational mechanisms. You will turn ambiguous technical and operational inputs into clear priorities, metrics, decisions, and action plans. Your day-to-day involves strengthening dashboards, telemetry, documentation, onboarding, playbooks, and repeatable processes to improve how the organization scales. You will also drive the practical use of AI to enhance operations productivity, reduce manual toil, and accelerate triage.
The ideal candidate brings crisp communication, strong ownership, and pragmatic simplification to high-visibility GPU operations programs where disciplined execution, customer impact, and measurable reliability outcomes matter. You will serve as a primary escalation point between engineering and operations teams, resolving priority conflicts and accelerating issue resolution while translating complex situations into accurate narratives for senior stakeholders.
Applications for this role will generally be accepted for at least three calendar days from the posting date or as long as the job remains posted. Candidates are evaluated based on their ability to lead complex, cross-functional initiatives with measurable outcomes and their experience in building cadences, governance mechanisms, and KPI reporting.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
We are committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, please let us know by emailing accommodation-request_mb@oracle.com or by calling 1-888-404-2494 in the United States.
We encourage employees to give back to their communities through our volunteer programs and offer competitive benefits that support our people with flexible medical, life insurance, and retirement options.
Work model: On-site
Des Moines, Iowa
Experience with cloud infrastructure, AI/ML infrastructure, GPU operations, data center deployment, capacity planning, or large-scale platform operations.
Experience supporting large GPU fleets, distributed AI training or inference workloads, or performance-sensitive infrastructure environments.
Experience with incident management, root cause analysis, corrective and preventive action tracking, Change Review Board processes, or high-volume change governance.
Familiarity with observability, telemetry, RDMA, RoCE, InfiniBand, network fabric health, service health metrics, ticket/incident analytics, or operational dashboarding.
Finance, business planning, workforce planning, or operational readiness experience in a technology organization.
Track record of influencing senior business and technology leaders without relying on direct authority.