NVIDIA is looking for outstanding software engineers to work on NVIDIA's Data Center GPU Manager (DCGM) software. In this role you will work closely with the broader NVIDIA team to design and build Linux-based management agents, CLI tools and end-to-end integration solutions that combine GPUs with the rest of the data center software management ecosystem. We are focused on supporting NVIDIA products across HPC, cloud and enterprise on both bare metal and virtualized platforms as the role of GPUs in all of these environments expands rapidly.

Your contributions will span many aspects of GPU system integration, including telemetry and metrics, health checks, diagnostics, configuration, accounting and policy. These tools fill roles of both passive background monitoring and active online management with a core emphasis on operational transparency and seamless integration in customer environments. Your code will support single node developer systems through large clusters with thousands of nodes.

To be successful, you will need to have a strong Linux C/C++ background, familiarity with distributed software development, and a proven work ethic. You will be expected to jump in quickly and provide important contributions from day one. This is a dynamic work environment with many exciting opportunities awaiting. NVIDIA GPUs are central to many hot trends in the enterprise, cloud and datacenter. Come join us as we craft the future of accelerated computing and AI!

What you'll be doing:

* Develop robust, scalable C++ user space data center management system software under Linux
* Build and maintain user-space libraries, agents, plugins, bindings and CLI tools
* Enable GPU management integration with the OSS ecosystem, including Kubernetes and Docker
* Support internal and external users through bug fixes, documentation and feature improvements
* Maintain high quality products through robust test coverage and smart design

What we need to see:

* BS or higher in Computer Science or equivalent experience
* 5+ years of meaningful industry experience with a strong C++ development background
* Familiarity with modern C++ standards (C++17/C++20)
* User space development and debugging expertise under Linux environments
* Experience with APIs and interface design
* Experience with IPC and multi-threading
* Outstanding written and verbal interpersonal skills
* Strong motivation and commitment to learn new skills
* Ability to implement all aspects of the software development lifecycle
* Ability to manage time in a fast, heavily multitasked environment
* Experience writing unit and system tests to ensure the correctness of fixes and new features

Ways to stand out from the crowd:

* Development experience with Python, Go, and Rust
* Experience with Jenkins and GitHub/GitLab CI/CD pipelines
* Experience with containers, common orchestration frameworks and common logging/telemetry backends
* Experience with APIs and interface design
* Exposure to GPU programming with CUDA
* Experience with enterprise software development
* Experience with cross-language interfaces (FFI, swig, etc.) in Go (CGO), Python, and Rust
* Experience with metrics gathering/monitoring best practices
* Experience with Open Telemetry, Prometheus, Grafana, DataDog, etc.
* Good understanding of extensive distributed systems and data-center operations/limitations

NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!

The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits.

NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA is a Learning Machine

NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest industries and profoundly impacting society. Learn more about NVIDIA.

#Salary and compensation
Estimated salary based on similar Design, Docker, Cloud, Node, Senior and Engineer jobs (separate from the base salary range stated in the posting above):

$80,000 - $117,500/year

#Location
US, WA, Redmond
Please mention that you found the job on Remote OK. This helps us get more companies to post here, thanks!
When applying for jobs, you should NEVER have to pay to apply. You should also NEVER have to pay to buy equipment which they then pay you back for later. Also never pay for trainings you have to do. Those are scams! NEVER PAY FOR ANYTHING! Posts that link to pages with "how to work online" are also scams. Don't use them or pay for them. Also always verify you're actually talking to the company in the job post and not an imposter. A good idea is to check the domain name for the site/email and see if it's the actual company's main domain name. Scams in remote work are rampant, be careful! Read more to avoid scams. When clicking on the button to apply above, you will leave Remote OK and go to the job application page for that company outside this site. Remote OK accepts no liability or responsibility as a consequence of any reliance upon information on there (external sites) or here.

ChainSafe is a leading blockchain research and development firm specializing in infrastructure solutions for the decentralized web. Alongside its contributions to significant ecosystems such as Ethereum, Polkadot, Filecoin, and more, ChainSafe creates solutions for developers across the web3 space utilizing our expertise in gaming, bridging and decentralized storage. As part of the mission to build innovative products for users and better tooling for developers, ChainSafe embodies an open-source and community-oriented ethos.

At ChainSafe, you'll be part of a global remote team that believes in the community's vital importance and contributes to advancing humanity with open-source and decentralized technology. To learn more about ChainSafe, look at our website or check out our work on GitHub.

About the role

As a DevOps Engineer SRE for the Infrastructure Team, you will play a vital role in defining and implementing best-practice strategies and guides to ensure the reliability, scalability, and performance of our infrastructure that supports the daily production activities across multiple blockchain ecosystems. This includes multiple cloud and bare metal service providers, based on our containerized stack across Linux environments.

Your expertise will contribute to the sophistication of blockchain applications and redefine the boundaries of what's possible within this emerging technological sphere. All work across ChainSafe will be open-source, ensuring expansive opportunities for deep contribution and collaborative efforts across various web3 blockchains and ecosystems.

Responsibilities

What you will be doing

* Oversee and enhance the health, performance, and security of environments, servers, and applications across the entire technology stack, including various blockchain services and full nodes.
* Engage in managing various global environments, considering resources and latency to their observed regions
* Be on-call, able to respond promptly outside of business hours
* Implement automation efforts around builds, deployment, and automatic scaling
* Work directly with the development and support teams to resolve issues
* Design and implement procedures related to ChainSafe's infrastructure operations
* Execute deployments and network upgrades
* Run and improve the incident response program
* Provide training and guidance for other members of the infrastructure team, ensuring round-the-clock node operation and incident response.
* Document and communicate technical details via open-source documentation
* Collaborate with various internal teams and the wider community to build, expand, and scale ChainSafe's infrastructure architecture, by tapping into new trends and opportunities highlighted by internal data, blockchain research, and the wider blockchain ecosystem

Requirements

* Practical knowledge of at least one programming language (Go, TypeScript, Solidity, or Rust is a big plus)
* Demonstrable experience with modern Infrastructure as Code (IaC) tools (Terraform, Helm, Ansible, etc.), automating deployment, and best CI/CD practices and tools.
* 3+ years of experience managing resources in either AWS, GCP, or Azure.
* 3+ years of experience working with Linux.
* 3+ years of experience with monitoring and alerting tools (DataDog, Grafana, Prometheus, etc.)
* 3+ years of experience implementing distributed tracing, monitoring, and logging systems using OpenTelemetry Protocol
* 3+ years of experience building and participating in incident response systems (PagerDuty, etc.) and handling the emergency response to production environment failures.
* Excellent communication skills with the ability to document and convey technical details clearly
* Ability to work autonomously as well as with the wider team

As a plus:

* Experience working in Web3 domain
* Experience working with bare metal deployments
* Experience automating network deployment
* Understanding at least two of the following domains - Web Security, Web3 Security, Cloud Security, Systems Security, and Applied Cryptography.

Hiring Steps

* Selected candidates will be invited to a 30-to-45-minute values interview with one or two of our team members
* Technical 60-minute interview with one or two of our engineers.
* Then, candidates will be asked to complete a homework assignment in under 3-4 hours.
* Lastly, a 60-minute call with the hiring team to discuss the results and final interview.

Why Join ChainSafe

Founded by developers for developers, ChainSafe is a remote-first company with an international team. We continue to provide opportunities for personal and professional growth, value autonomy and responsibility, have a results-driven environment, and offer flexible work hours.

We care deeply about our values and look for these attributes in every new team member. In addition, we recognize the benefits of cultivating a diverse team and aspire to embed respect for all people into our culture. We encourage women, the LGBTQIA+ community, people of colour, and members of any other group underrepresented in the blockchain space (or tech in general) to apply.

How to Apply

Please fill out the Greenhouse application form below and ensure that you attach your resume and link your GitHub/GitLab profile or any software project you have contributed to (if applicable).

#Salary and compensation
No salary data was published by the company, so this estimate is based on similar Web3, DevOps, Cloud, Node and Engineer jobs:

$70,000 - $110,000/year

#Location
Vancouver, British Columbia, Canada
This job post is closed and the position is probably filled. Please do not apply.

Who We Are

Subspace Network is building a radically decentralized, next-generation blockchain which allows developers to easily run Web3 apps at Internet scale. Subspace is based on original research funded by the US National Science Foundation and is planning to launch its Network later this year. Subspace Labs is an early-stage, venture-backed startup with a remote-first, globally distributed team. To learn more, visit our website, view our team handbook, and read the technical whitepaper.

We are seeking a DevOps Engineer to join our rapidly growing team of Blockchain and Cryptocurrency enthusiasts and engineers.

In this role you will be responsible for:
* Operating and maintaining the backend blockchain infrastructure of the Subspace Network on top of bare metal services, VMs and Kubernetes clusters.
* Following principles of infrastructure as code to make operations well defined and maintainable.
* Ensuring reliable operation through monitoring and observability.
* Educating and enabling the rest of the engineering team with automation of CI/CD and other development workflows.
* Sustaining good site reliability practices with incident response and postmortem analysis.

Key Requirements:
* Rich experience working with Linux servers, networking, virtualization and containerization technologies.
* Experience setting up and operating production Kubernetes clusters, virtual machines and Linux bare metal servers.
* Familiarity with the cloud-native tooling landscape and experience successfully deploying such tools to production environments (Terraform, Ansible, Prometheus/VictoriaMetrics+Grafana, ArgoCD, etc.).
* Solid understanding of blockchains and successful experience maintaining blockchain node (bootstrap, RPC, validator) infrastructure.
* Be proactive and willing to help engineers stay productive in a fully remote setting.
* Knowledge of at least one programming language.
* A passion for decentralized, peer-to-peer systems and Web3 technologies.

Bonus Experience:
* Familiarity with the Rust and TypeScript languages and their tooling.
* Familiarity with Substrate and the Polkadot ecosystem.
* Active participation in Open Source development of DevOps/SRE-related tooling, presenting at conferences and meetups.

What We Offer:
* A globally distributed work environment with a high degree of autonomy and agency.
* You will play a critical role in implementing a new layer one blockchain.
* Salary and options befitting an early hire at a venture-backed startup.
* Medical, dental, and vision insurance (US-based only).
* Company-sponsored team offsites in various locations around the world.

#Salary and compensation
No salary data was published by the company, so this estimate is based on similar Web3, DevOps, TypeScript, Node, Engineer, Linux and Backend jobs:

$70,000 - $120,000/year

#Location
International

#How do you apply?

This job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.