Lead Cloud Operations Engineer

University of Chicago

This job is no longer accepting applications

See open jobs at University of Chicago.
Operations
Chicago, IL, USA
Posted on Saturday, December 16, 2023

Department

BSD CTD - Platform Engineering - Ops


About the Department

The Center for Translational Data Science (CTDS) at the University of Chicago is a research center whose mission is to develop the discipline of translational data science and apply it to impactful problems in biology, medicine, healthcare, and the environment. We envision a world in which researchers have ready access to the data and tools they need to make data-driven discoveries that increase our scientific knowledge and improve the quality of life. We architect ecosystems of large-scale commons of research data, computing resources, applications, tools, and services so that the broader research community can use data at scale to pursue scientific inquiry and accelerate discovery. Learn more at https://gdc.cancer.gov/, https://gen3.org/, https://stats.gen3.org/, and https://ctds.uchicago.edu/.

This at-will position is wholly or partially funded by contractual grant funding which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.


Job Summary

As the Lead Cloud Operations Engineer, you will play a pivotal role in designing, configuring, managing, and supporting our expansive cloud computing infrastructure. You will provide technical leadership to a team responsible for overseeing the operations of 25,000+ cores and 15+ PB of storage of controlled and open access cancer genomics data. Innovation is key, as you lead efforts to optimize infrastructure setup, configuration, and refresh, fostering collaboration and efficiency across various subsystems. Serving as both a technical leader and project manager, you will guide the administration of operating systems, implement upgrades, and maintain security measures. Your advanced experience in infrastructure operations and system administration will drive the success of our dynamic and expanding infrastructure.

Responsibilities

  • Lead the design, configuration, management, and support of our large-scale cloud computing infrastructure.

  • Oversee operations of 25,000+ cores and 15+ PB of storage, built primarily on commodity hardware running GNU/Linux.

  • Track, implement, and lead security and compliance activities in cooperation with the information security team.

  • Respond promptly to operational incidents, identify and address operational risks, process RMAs, and report operational data and statistics regularly.

  • Innovate and foster innovation with the team and within the infrastructure, including optimizing our infrastructure setup, refresh, and configuration to improve operational efficiency across various subsystems.

  • Lead support of the rapid growth of our existing large physical infrastructure in a hybrid model with public clouds in a secure, stable, and maintainable manner.

  • Maintain broad technical knowledge of existing and emerging technologies, including developments in hardware offerings and public cloud offerings from Amazon Web Services, Microsoft Azure, and Google Cloud.

  • Ensure and optimize operational efficiencies through automation, innovation, and inter- and intra-team collaboration.

  • Lead and implement the design, setup, provisioning, and deployment of new systems to support multi- and hybrid-cloud architecture and expansion.

  • Lead enhancement of infrastructure and application monitoring.

  • Mentor, coach, and train other members on the team, serving as their source of technical leadership.

  • Lead technical aspects of projects for the systems administration team, including delegating tasks and organizing and managing meetings.

  • Serve as scrum master/project manager for day-to-day work, preparing team sprints, tracking velocity, notifying partners of changes and deviations, etc.

  • Solves complex problems to configure, install, upgrade, and maintain server applications and hardware. Works to safeguard the integrity of computer software. Implements operating system enhancements to improve the reliability and performance of the system.

  • Guides the administration of operating systems, maintains security, and implements backup procedures for the organization's information systems and peripheral equipment, such as servers, desktops, printers, and storage devices.

  • Provides expertise in planning and installing necessary patches and upgrades for servers and their associated storage, network, communications, and peripheral sub-systems. Installs and maintains an appropriate level of intrusion detection, monitoring, and auditing software as required.

  • Tracks compliance and maintains documentation for hardware, software, and service inventories for management reports.

  • Performs other related work as needed.


Minimum Qualifications

Education:

Minimum requirements include a college or university degree in related field.

---
Work Experience:

Minimum requirements include knowledge and skills developed through 7+ years of work experience in a related job discipline.

---
Certifications:

---

Preferred Qualifications

Experience:

  • Advanced experience in infrastructure operations and effective provisioning, automation, installation/configuration, operation, security and maintenance of systems hardware, software, and related network infrastructure for use by all external and internal users.

  • 7+ years of professional experience providing system administration with increasing technical and service responsibility.

  • 5+ years of experience leading or owning important aspects of infrastructure work (e.g., networking, storage, virtualization, monitoring) for a large-scale system.

  • Exceptional experience managing GNU/Linux servers (required) and expert-level knowledge of systems administration tools, languages, and shell scripts.

  • Experience drafting reports, diagrams, and documentation describing systems and procedures.

  • Extensive experience standing up, provisioning, and managing on-prem infrastructure as well as commercial clouds (AWS and GCP), and in-depth knowledge of configuration and management tools such as OpenStack, Chef, Salt, Kubernetes, etc.

  • Experience with advanced infrastructure and application monitoring, log aggregation, and reporting.

  • 2+ years of technical team leadership experience.

  • Experience with infrastructure budgeting and procurement, project management, and implementation.

  • Experience working with sensitive data (PII, PHI, CUI, etc.).

Preferred Competencies

  • Ability to break down complex technical offerings into concise information that non-technical leadership and financial administrators can use for decision making.

  • Ability to lead a team and maintain a collaborative, cross-functional team environment.

  • Self-driven, detail-oriented, and highly experienced working on all aspects of computer systems from low-level hardware to advanced tasks such as automating and optimizing virtual machine deployment and configuration management.

  • Exceptional deductive and investigative skills to identify and diagnose complex, nonintuitive technical problems.

  • Ability to apply in-depth knowledge and experience of internal or external business issues to improve products or services.

  • Familiarity with integration and management issues in a heterogeneous cloud-based computing environment.

  • Ability to learn new procedures, techniques, and approaches quickly.

  • Ability to effectively assist and train members of all levels of ability.

  • Knowledge of bioinformatics and genomic data.

  • Knowledge of large-scale backup/storage systems (tape/object store).

  • Knowledge of cybersecurity best practices and compliance with FISMA and FedRAMP.

Working Conditions

  • Office & data center environment with a hybrid schedule available.

Salary Range

  • $130K - $155K

Application Documents

  • Resume (required)

  • Cover Letter (preferred)


When applying, the document(s) MUST be uploaded via the My Experience page, in the section titled Application Documents of the application.


Job Family

Information Technology


Role Impact

Individual Contributor


FLSA Status

Exempt


Pay Frequency

Monthly


Scheduled Weekly Hours

40


Benefits Eligible

Yes


Drug Test Required

No


Health Screen Required

No


Motor Vehicle Record Inquiry Required

No


Posting Statement

The University of Chicago is an Affirmative Action/Equal Opportunity/Disabled/Veterans Employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender, gender identity, national or ethnic origin, age, status as an individual with a disability, military or veteran status, genetic information, or other protected classes under the law. For additional information please see the University's Notice of Nondiscrimination.

Staff job seekers in need of a reasonable accommodation to complete the application process should call 773-702-5800 or submit a request via the Applicant Inquiry Form.

We seek a diverse pool of applicants who wish to join an academic community that places the highest value on rigorous inquiry and encourages a diversity of perspectives, experiences, groups of individuals, and ideas to inform and stimulate intellectual challenge, engagement, and exchange.

All offers of employment are contingent upon a background check that includes a review of conviction history. A conviction does not automatically preclude University employment. Rather, the University considers conviction information on a case-by-case basis and assesses the nature of the offense, the circumstances surrounding it, the proximity in time of the conviction, and its relevance to the position.

The University of Chicago's Annual Security & Fire Safety Report (Report) provides information about University offices and programs that provide safety support, crime and fire statistics, emergency response and communications plans, and other policies and information. The Report can be accessed online at: http://securityreport.uchicago.edu. Paper copies of the Report are available, upon request, from the University of Chicago Police Department, 850 E. 61st Street, Chicago, IL 60637.
