CDAO Advana - Site Reliability Engineering Lead - Model Serving Job at GDIT, Washington DC

  • GDIT
  • Washington DC

Job Description

Responsibilities for this Position

Location: USA DC Washington
Full Part/Time: Full time
Job Req: RQ220265

Type of Requisition:
Pipeline

Clearance Level Must Currently Possess:
Top Secret

Clearance Level Must Be Able to Obtain:
Top Secret/SCI

Public Trust/Other Required:
None

Job Family:
IT Infrastructure and Operations

Job Qualifications:

Skills:
Artificial Intelligence (AI), DevSecOps, Kubernetes, Reliability Analysis
Certifications:
None
Experience:
8+ years of related experience
US Citizenship Required:
Yes

Job Description:

Join GDIT and be a part of the team of men and women who solve some of the world's most complex technical challenges. The CDAO Advana team is seeking a Site Reliability Engineering Lead - Model Serving to join their efforts in the DC area.

Advana is the Chief Digital and Artificial Intelligence Office's (CDAO) enterprise-wide, multi-domain data, analytics, and artificial intelligence (AI) platform that provides all DoD military and civilian decision makers, analysts, and builders with unprecedented access to enterprise data, tools, and capabilities.

This position is tied to a proposal with award expected in June 2026. If interested, please apply; we are interviewing and making contingent offers now.

Duties include:

  • Owns production reliability strategy for artificial intelligence and machine learning model serving across Advana enclaves supporting Department of Defense missions, Joint Staff analysts, Combatant Command elements, and Senior Executive Service leadership.
  • Defines service-level objectives, alerting philosophy, operational runbooks, and release safety patterns governing production deployment of model artifacts across multiple security domains.
  • Establishes reliability governance across serving surfaces by developing operational standards, on-call expectations, escalation pathways, and incident response patterns aligned with enterprise DevSecOps practices.
  • Implements reliability engineering methodologies using Kubernetes, Prometheus, Grafana, Elastic Stack, GitLab Continuous Integration, VMware environments, and hardened deployment pipelines to maintain operational stability, mission assurance posture, and cross-domain readiness.
  • Develops automated reliability checks integrated into deployment workflows to validate performance, latency, availability, and operational suitability of production-ready models.
  • Leads coordination with Platform One, Cloud One, multi-national engineering teams, and cross-service mission partners to align reliability strategy with evolving architectures, security requirements, and mission priorities.
  • Produces mission-critical deliverables including service-level objective documentation, alerting configurations, operational runbooks, reliability scorecards, incident post-action reports, and release safety assessments.
  • Strengthens program value by advancing operational readiness, reducing mission risk, and reinforcing deployment consistency across all enclaves.
  • Supports Tier-4 incident response actions by maintaining authoritative reliability artifacts required for rapid triage, operational continuity, and sustained mission performance.

Basic Qualifications:
  • BS degree; additional years of experience may be considered in lieu of degree
  • 8+ years of experience developing reliability strategy
  • AI and machine learning experience
  • CompTIA Security+
  • TS with SCI eligibility

WHAT CAN GDIT OFFER YOU?
  • Excellent customizable health benefits (Medical, Dental and Vision)
  • 401K with company match
  • Educational Assistance and eLearning
  • Flexible work week
  • Internal mobility team dedicated to employee advancement
  • Rewards and Recognition programs
  • Innovative and collaborative environment that encourages highly motivated critical thinking

The likely salary range for this position is $128,039 - $173,229. This is not, however, a guarantee of compensation or salary. Rather, salary will be set based on experience, geographic location and possibly contractual requirements and could fall outside of this range.

Scheduled Weekly Hours:
40

Travel Required:
None

Telecommuting Options:
Onsite

Work Location:
USA DC Washington

Additional Work Locations:

Total Rewards at GDIT:
Our benefits package for all US-based employees includes a variety of medical plan options, some with Health Savings Accounts, dental plan options, a vision plan, and a 401(k) plan offering the ability to contribute both pre- and post-tax dollars up to the IRS annual limits and receive a company match. To encourage work/life balance, GDIT offers employees full flex work weeks where possible and a variety of paid time off plans, including vacation, sick and personal time, holidays, paid parental, military, bereavement and jury duty leave. To ensure our employees are able to protect their income, other offerings such as short- and long-term disability benefits, life, accidental death and dismemberment, personal accident, critical illness and business travel and accident insurance are provided or available. We regularly review our Total Rewards package to ensure our offerings are competitive and reflect what our employees have told us they value most.

We are GDIT. A global technology and professional services company that delivers consulting, technology and mission services to every major agency across the U.S. government, defense and intelligence community. Our 26,000 experts extract the power of technology to create immediate value and deliver solutions at the edge of innovation. We operate across 50 countries worldwide, offering leading capabilities in digital modernization, AI/ML, Cloud, Cyber and application development. Together with our clients, we strive to create a safer, smarter world by harnessing the power of deep expertise and advanced technology.

Join our Talent Community to stay up to date on our career opportunities and events at gdit.com/tc.

Equal Opportunity Employer / Individuals with Disabilities / Protected Veterans








