Remote Manager AI System Infrastructure and MLOps Engineering
The Team\n\nThe AI/ML team is funding and building one of the largest computing systems dedicated to nonprofit life science research in the world. This new effort will provide the scientific community with access to predictive models of healthy and diseased cells, which will lead to groundbreaking new discoveries that could help researchers cure, prevent, or manage all diseases by the end of this century.\n\nAs a hands-on Manager of the AI System Infrastructure and MLOps Engineering team, you will be joining the AI/ML and Data Engineering team in CZI Central Tech, with responsibility for the stability and scalable operations of our leading-edge GPU Cloud Compute Cluster. This supports our AI Researchers in their development and training of state-of-the-art models in artificial intelligence and machine learning to solve important problems in the biomedical sciences aligned with CZI's mission, contributing to greater understanding of human cell function.\nThe Opportunity\n\nAs the Engineering Manager of the AI Infrastructure and MLOps Engineering team, you will be responsible for a variety of MLOps and AI development projects that empower our AI Researchers and help to accelerate Biomedical research across the whole of the AI lifecycle. You will guide our AI Systems Infrastructure and MLOps efforts focused on our GPU Cloud Cluster operations, ensuring that our systems are highly utilized, performant, and stable. 
You will be working in collaboration with other members of our own AI Engineering team as well as the Science Initiative's AI Research team as they iterate and train their deep learning code, optimizing systems operations and helping to troubleshoot problems encountered by jobs running on the cluster.\nWhat You'll Do\n\n\n* Help to build out the MLOps and Systems Infrastructure Engineering team, growing the team to support the large scale capacity systems and AI training efforts we will be undertaking.\n\n* Drive our MLOps processes and System Infrastructure Engineering efforts in ensuring that our GPU Cloud computing systems are highly utilized and stable, and proactively guide our team in implementing the instrumentation and observability tooling integral to our AI Platform.\n\n* Own the on-call efforts for our GPU Cloud computing systems, building out the MLOps and Systems Infrastructure Engineering alerting and monitoring efforts for our leading-edge Kubernetes-based AI platform, including troubleshooting problems encountered on the GPU platform infrastructure and with jobs running on the cluster and computing systems.\n\n* Take responsibility for a variety of AI/ML development infrastructure, instrumentation, and telemetry projects that empower our team in supporting our users across the AI/ML lifecycle, taking a key role in simplifying and optimizing the systems and processes that are integral to our GPU Cloud Cluster operations, in an MLOps-meets-SRE hybrid operations model.\n\n* Mentor and manage your team in fulfilling their roles to the best of their abilities, provide skill and career coaching to help the team members keep growing along their own career and life paths, and keep the team engaged in meaningful and interesting projects in service of our north star philanthropic mission\n\n\n\nWhat You'll Bring\n\n\n* Hands-on AI/ML Model Training Platform Operations experience in an environment with challenging data and systems platform 
challenges\n\n* MLOps experience working with medium to large scale GPU clusters in Kubernetes, HPC environments, or large scale Cloud based ML deployments (Kubernetes preferred)\n\n* BS, MS, or PhD degree in Computer Science or a related technical discipline or equivalent experience\n\n* 2+ years of experience managing MLOps teams\n\n* 7+ years of relevant coding and systems experience\n\n* 7+ years of systems Architecture and Design experience, with a broad range of experience across Data, AI/ML, Core Infrastructure, and Security Engineering\n\n* Strong understanding of scaling containerized applications on Kubernetes or Mesos, including solid understanding of AI/ML training with containers using secure AMIs and continuous deployment systems that integrate with Kubernetes or Mesos (Kubernetes preferred)\n\n* Proficiency with Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure, and experience with On-Prem and Colocation Service hosting environments\n\n* Solid coding ability with a systems language such as Rust, C/C++, C#, Go, Java, or Scala\n\n* Extensive experience with a scripting language such as Python, PHP, or Ruby (Python preferred)\n\n* Working knowledge of NVIDIA CUDA and AI/ML custom libraries.\n\n* Knowledge of Linux systems optimization and administration\n\n* Understanding of Data Engineering, Data Governance, Data Infrastructure, and AI/ML execution platforms.\n\n* PyTorch, Keras, or TensorFlow experience is a strong plus\n\n\n\nCompensation\n\nThe Redwood City, CA base pay range for this role is $214,000 - $321,000. New hires are typically hired into the lower portion of the range, enabling employee growth in the range over time. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process. Pay ranges outside Redwood City are adjusted based on cost of labor in each respective geographical market. 
Your recruiter can share more about the specific pay range for your location during the hiring process.\nBenefits for the Whole You \n\nWe're thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible. \n\n\n* CZI provides a generous 100% match on employee 401(k) contributions to support planning for the future. \n\n* Annual funding for employees that can be used most meaningfully for them and their families, such as housing, student loan repayment, childcare, commuter costs, or other life needs.\n\n* CZI Life of Service Gifts are awarded to employees to "live the mission" and support the causes closest to them.\n\n* Paid time off to volunteer at an organization of your choice. \n\n* Funding for select family-forming benefits. \n\n* Relocation support for employees who need assistance moving to the Bay Area\n\n* And more!\n\n\n\nCommitment to Diversity\n\nWe believe that the strongest teams and best thinking are defined by the diversity of voices at the table. We are committed to fair treatment and equal access to opportunity for all CZI team members and to maintaining a workplace where everyone feels welcomed, respected, supported, and valued. Learn about our diversity, equity, and inclusion efforts. \n\nIf you're interested in a role but your previous experience doesn't perfectly align with each qualification in the job description, we still encourage you to apply as you may be the perfect fit for this or another role.\n\nExplore our work modes, benefits, and interview process at www.chanzuckerberg.com/careers.\n\n#LI-Remote #LI-Hybrid #LI-Onsite \n\n#Salary and compensation\n
No salary data published by company, so we estimated salary based on similar jobs related to Design, Amazon, Recruiter, Cloud and Ruby:\n\n
$30,000 — $80,000/year\n
\n\n#Benefits\n
401(k)\n\nDistributed team\n\nAsync\n\nVision insurance\n\nDental insurance\n\nMedical insurance\n\nUnlimited vacation\n\nPaid time off\n\n4 day workweek\n\n401k matching\n\nCompany retreats\n\nCoworking budget\n\nLearning budget\n\nFree gym membership\n\nMental wellness budget\n\nHome office budget\n\nPay in crypto\n\nPseudonymous\n\nProfit sharing\n\nEquity compensation\n\nNo whiteboard interview\n\nNo monitoring system\n\nNo politics at work\n\nWe hire old (and young)\n\n
\n\n#Location\nRedwood City, California, United States
Please mention that you found the job on Remote OK; this helps us get more companies to post here. Thanks!
When applying for jobs, you should NEVER have to pay to apply. You should also NEVER have to pay to buy equipment which they then pay you back for later. Also never pay for trainings you have to do. Those are scams! NEVER PAY FOR ANYTHING! Posts that link to pages with "how to work online" are also scams. Don't use them or pay for them. Also always verify you're actually talking to the company in the job post and not an imposter. A good idea is to check the domain name for the site/email and see if it's the actual company's main domain name. Scams in remote work are rampant, be careful! Read more to avoid scams. When clicking on the button to apply above, you will leave Remote OK and go to the job application page for that company outside this site. Remote OK accepts no liability or responsibility as a consequence of any reliance upon information on there (external sites) or here.
This job post is closed and the position is probably filled. Please do not apply. Work for AHEAD and want to re-open this job? Use the edit link in the email when you posted the job!
Closed by robot after the apply link errored with code 404, 1 year ago
\nAHEAD builds platforms for digital business. By weaving together advances in cloud infrastructure, automation and analytics, and software delivery, we help enterprises deliver on the promise of digital transformation.\n\n\nAt AHEAD, we prioritize creating a culture of belonging, where all perspectives and voices are represented, valued, respected, and heard. We create spaces to empower everyone to speak up, make change, and drive the culture at AHEAD. \n\n\nWe are an equal opportunity employer, and do not discriminate based on an individual's race, national origin, color, gender, gender identity, gender expression, sexual orientation, religion, age, disability, marital status, or any other protected characteristic under applicable law, whether actual or perceived. \n\n\nWe embrace all candidates that will contribute to the diversification and enrichment of ideas and perspectives at AHEAD. \n\n\nNetwork Security Pre Sales Specialist \n\n\nThe AHEAD Network Security Specialist will be focused on the core tenets of Network and Network Security related solutions for AHEAD prospects and customers including a broad range of technologies. As a member of our specialist team, you will be considered an organizational thought leader for this marketplace.\n\n\nYou will work in partnership with multiple sales representatives to help build a cohesive account strategy for each existing and prospective client. As part of this strategy, you will help sales representatives identify potential technologies and vendors to partner with in a given account and assist in driving this strategy throughout a given campaign. \n\n\nTime management and prioritization are essential attributes to be successful. Your goal is to evolve into the technical thought leader for each of your clients. In this capacity, you will work to define strategy, compare and contrast alternative approaches, and build key relationships, as well as size and configure products. 
Positioning and leveraging subject matter experts and inside consultants is a necessity to scale your effectiveness within accounts. \n\n\nAHEAD is actively recruiting for this role and is looking for candidates who consider themselves a "Next Generation Network / Datacenter Architect." A core tenet and base requirement for this position is understanding and mastery of Cisco Route/Switch and/or Data Center technologies. Beyond this core requirement, we expect to attract and recruit people who want to focus not only on traditional on-premises Cisco solutions but also on career, technology, and skill advancement, and who are willing to demonstrate the ability to learn new technologies, including multiple SDN solutions, network security, load balancing, edge / acceleration, and cloud technologies. Understanding that the impact of cloud computing is real and should be incorporated into solutions is key. Experience with, or eagerness to learn about, network and system / access design with Amazon AWS, Microsoft Azure, or other cloud services is a key requirement. 
\n \n\n\n\n\n\nRoles and Responsibilities\n* Participation, Leadership and Coordination of the AHEAD virtual team for Network Infrastructure\n* You will also be responsible for working with the AHEAD partner management team and directly with selected vendors to maintain vendor technical and sales relationships\n* As a Network Specialist you are a thought leader within the Engineering organization, responsible for designing, architecting, and, at times, consulting for Ahead clients\n* Your goal as a Network Lead should be to build skills, relationships, and industry credibility across many segments to drive product, services and consulting sales\n* Support the sales teams during engagements by providing advice and solutions or proposals optimized to meet customer technical and business requirements \n* Develops relationships with the account managers, partners and customers in support of sales team objectives and engages and leverages team specialist resources as appropriate \n* Strategizes and executes technical sales calls \n* Completes required pre-sales documentation (configurations, pricing, services, presentations, justifications etc.) quickly and accurately \n* Qualifies sales opportunities in the terms of customer technical requirements, decision making process and funding \n* Presents and markets the design and value of proposed Ahead solutions and business justification to customers, prospects and Ahead management \n* Participates in mentorship of entry-level team members \n* Possesses strong, detailed product / technology / industry knowledge. Knowledge of job associated software and applications \n\n\n\nBase Skills\n* Network Infrastructure Experience \n* Cisco certifications are required. 
Examples: CCIE, CCNP \n* Cisco Nexus 9K, ACI, 7K, 5K, 2K, 1K \n* Experience with Converged Network Fabrics / Converged Infrastructure Systems \n* IP Routing Protocols \n* Layer 2 protocols and technologies \n* Cisco Core Datacenter Technologies \n* ACI \n* Network Management \n* MPLS, DMVPN, Multicast \n* Firewalls, VPN, IPSec / L2TP etc \n* Network Security\n\n\n\nOther Skills\n* VMware / VMware NSX \n* Palo Alto\n* Arista\n* Fortinet\n* Automation skillet\n* F5 / NetScaler \n* Amazon AWS, Azure, Cloud connectivity and network fundamentals \n\n\n\nEducation and Experience\n* Bachelor's degree in Computer Science, Engineering, or equivalent \n* Pre Sales and Data Center experience is required \n* Professional communication, presentation, analytical, and problem-solving skills. \n* Experience working as a technical lead in a pre-sales sales campaign \n* Ability to work under critical conditions and influence others to achieve results \n* Strong interpersonal skills with excellent presentation skills \n* Must be independent, self-motivated, a self-starter and possess a good working attitude \n* Able to work well within a team and partner environment \n* Customer focused and results driven \n\n\n\n\n\nWhy AHEAD:\n\n\nThrough our daily work and internal groups like Moving Women AHEAD and RISE AHEAD, we value and benefit from diversity of people, ideas, experience, and everything in between.\n\n\nWe fuel growth by stacking our office with top-notch technologies in a multi-million-dollar lab, by encouraging cross-department training and development, and by sponsoring certifications and credentials for continued learning.\n\n\nWe understand that you have a life outside of work. That's why we offer flexible paid time off, paid company holidays, and the ability for you to manage your work schedule as needed. \n\n#Salary and compensation\n
No salary data published by company, so we estimated salary based on similar jobs related to Design, Amazon, Consulting, Cloud, Microsoft, Sales and Digital Nomad:\n\n
$70,000 — $115,000/year\n
\n\n#Benefits\n
401(k)\n\nDistributed team\n\nAsync\n\nVision insurance\n\nDental insurance\n\nMedical insurance\n\nUnlimited vacation\n\nPaid time off\n\n4 day workweek\n\n401k matching\n\nCompany retreats\n\nCoworking budget\n\nLearning budget\n\nFree gym membership\n\nMental wellness budget\n\nHome office budget\n\nPay in crypto\n\nPseudonymous\n\nProfit sharing\n\nEquity compensation\n\nNo whiteboard interview\n\nNo monitoring system\n\nNo politics at work\n\nWe hire old (and young)\n\n
\n\n#Location\nPhoenix, Arizona
# How do you apply?\n\nThis job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.