NVIDIA is looking for outstanding software engineers to work on NVIDIA's Data Center GPU Manager (DCGM) software. In this role you will work closely with the broader NVIDIA team to design and build Linux-based management agents, CLI tools and end-to-end integration solutions that combine GPUs with the rest of the data center software management ecosystem. We are focused on supporting NVIDIA products across HPC, cloud and enterprise on both bare metal and virtualized platforms as the role of GPUs in all of these environments expands rapidly. Your contributions will span many aspects of GPU system integration, including telemetry and metrics, health checks, diagnostics, configuration, accounting and policy. These tools fill roles of both passive background monitoring and active online management with a core emphasis on operational transparency and seamless integration in customer environments. Your code will support single node developer systems through large clusters with thousands of nodes. To be successful, you will need to have a strong Linux C/C++ background, familiarity with distributed software development, and a proven work ethic. You will be expected to jump in quickly and provide important contributions from day one. This is a dynamic work environment with many exciting opportunities awaiting. NVIDIA GPUs are central to many hot trends in the enterprise, cloud and datacenter. Come join us as we craft the future of accelerated computing and AI!

What you'll be doing:

* Develop robust, scalable C++ user space data center management system software under Linux
* Build and maintain user-space libraries, agents, plugins, bindings and CLI tools
* Enable GPU management integration with the OSS ecosystem, including Kubernetes and Docker
* Support internal and external users through bug fixes, documentation and feature improvements
* Maintain high quality products through robust test coverage and smart design

What we need to see:

* BS or higher in Computer Science or equivalent experience.
* 5+ years of meaningful industry experience with a strong C++ development background
* Familiarity with modern C++ standards (C++17/C++20).
* User space development and debugging expertise under Linux environments
* Experience with APIs and interface design.
* Experience with IPC and Multi-threading
* Outstanding written and verbal interpersonal skills
* Strong motivation and commitment to learn new skills
* Ability to implement all aspects of the software development lifecycle
* Ability to manage time in a fast, heavily multitasked environment
* Experience writing unit and system tests to ensure the correctness of fixes and new features

Ways to stand out from the crowd:

* Development experience with Python, Go, and Rust.
* Experience with Jenkins and GitHub/GitLab CI/CD pipelines.
* Experience with containers, common orchestration frameworks and common logging/telemetry backends
* Experience with APIs and interface design.
* Exposure to GPU programming with CUDA.
* Experience with enterprise software development.
* Experience with cross-language interfaces (FFI, swig, etc.) in Go (CGO), Python, and Rust.
* Experience with metrics gathering/monitoring best practices.
* Experience with Open Telemetry, Prometheus, Grafana, DataDog, etc.
* Good understanding of extensive distributed systems and data-center operations/limitations.

NVIDIA is widely considered to be one of the technology world's most desirable employers.
We have some of the most forward-thinking and hardworking people on the planet working for us. If you're creative and autonomous, we want to hear from you!

The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA is a Learning Machine

NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest industries and profoundly impacting society. Learn more about NVIDIA.

#Salary and compensation
No salary data published by company so we estimated salary based on similar jobs related to Design, Docker, Cloud, Node, Senior and Engineer jobs that are similar:

$80,000 - $117,500/year

#Benefits

* 401(k)
* Distributed team
* Async
* Vision insurance
* Dental insurance
* Medical insurance
* Unlimited vacation
* Paid time off
* 4 day workweek
* 401k matching
* Company retreats
* Coworking budget
* Learning budget
* Free gym membership
* Mental wellness budget
* Home office budget
* Pay in crypto
* Pseudonymous
* Profit sharing
* Equity compensation
* No whiteboard interview
* No monitoring system
* No politics at work
* We hire old (and young)

#Location
US, WA, Redmond

Please mention that you found the job on Remote OK; this helps us get more companies to post here. Thanks!
When applying for jobs, you should NEVER have to pay to apply. You should also NEVER have to pay to buy equipment which they then pay you back for later. Also never pay for trainings you have to do. Those are scams! NEVER PAY FOR ANYTHING! Posts that link to pages with "how to work online" are also scams. Don't use them or pay for them. Also always verify you're actually talking to the company in the job post and not an imposter. A good idea is to check the domain name for the site/email and see if it's the actual company's main domain name. Scams in remote work are rampant, be careful! Read more to avoid scams. When clicking on the button to apply above, you will leave Remote OK and go to the job application page for that company outside this site. Remote OK accepts no liability or responsibility as a consequence of any reliance upon information on there (external sites) or here.

ABOUT EIGENLABS

EigenLabs provides cryptoeconomic security as a service for blockchain projects. Our platform provides programmatic access to the trust layer of Ethereum to make components reusable, allowing builders to rely on Ethereum for security while saving time and resources typically allocated to bootstrapping their own token. EigenLabs technologies also create technical value & efficiency for validators in Ethereum's proof-of-stake network by enabling them to "re-stake" their assets. Re-staking allows validators to secure multiple protocols and boost rewards by providing security services to oracle networks, data availability networks and dApps running on rollups or side-chains. We're on a mission to hyperscale ETH & dApps and increase decentralization through re-staking!

THE ROLE

At EigenLabs, our engineering teams operate within a customer-aligned team structure, where we believe in shared responsibility for DevOps among all software engineers. As an Infrastructure Software Engineer, you will lead the design and implementation of critical infrastructure components while fostering a culture of collaboration and best practices adoption within your team, and across the company.

You will work closely with frontend and backend software engineers, leveraging your expertise to empower the entire team to collectively maintain, operate, and enhance infrastructure resources effectively. By championing leading practices, automation, and efficient workflows, you will ensure that our customer-aligned teams can deliver high-quality software solutions with speed, security, and reliability.

Your leadership in infrastructure design, implementation, and reliability engineering will directly contribute to EigenLabs' commitment to blockchain innovation and advancing economic freedom through technology.
If you're passionate about shaping the future of blockchain, and making a meaningful impact on the world, we look forward to hearing from you!

WHAT YOU WILL DO

* You will lead the design and implementation of foundational infrastructure components used by every engineering team in production. This includes dynamic configuration, DNS and networking setup, secrets management, and container orchestration (e.g., Kubernetes, including platforms such as EKS).

* Demonstrate leadership by taking the initiative to address challenges and drive continuous improvement. Foster a culture of best practices adoption and automation within your team. Share your expertise to empower frontend and backend engineers to efficiently maintain, operate, and enhance infrastructure resources.

* Utilize your proficiency in cloud infrastructure platforms such as AWS, GCP, or similar to optimize our infrastructure for scalability and performance.

* Collaborate with cross-functional teams to identify and implement improvements in infrastructure, monitoring, and incident response.

* Make significant contributions to operational excellence initiatives, ensuring the highest level of efficiency and reliability in our infrastructure. Administer network capabilities and support CI/CD pipelines. Monitor infrastructure using tools like Prometheus, Grafana, or similar.

* Implement and maintain security best practices across all aspects of our infrastructure, including access controls, encryption, and network security

* Participate in on-call rotations to ensure the continued reliability and uptime of our services

* Design, implement and champion Continuous Delivery (CI/CD) principles to automate software development and deployment processes.

* Articulate a long-term vision for maintaining and scaling our infrastructure, aligning it with our product and technical goals.

* Build tools for blockchain node operators that make it easy to launch and operate different types of validator environments

* Proactively contribute to discussions about technical issues, sprint and roadmap planning, and improving engineering processes

WHAT YOU WILL BRING

* 5+ years of direct experience in infrastructure, SRE or back-end engineering with public cloud and Linux-based systems.

* Strong design and implementation experience with at least one major cloud platform (AWS preferred)

* Knowledge of authentication, authorisation and accounting (IAM, Federation, RBAC, service accounts) for public cloud and Kubernetes

* Containerization and container orchestration with Docker and Kubernetes. Experience with container hardening and implementation within Kubernetes.

* Proficiency in at least one back-end development language such as Python, Go, or C++

* Expertise in Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, etc., enabling efficient automation and configuration management.

* Demonstrated experience in production environments with proactive monitoring, logging, alerting, and profiling with tools such as Prometheus, Grafana, ELK Stack (or alternative log analysis platforms) to ensure robust performance monitoring and troubleshooting capabilities.

* Experience developing and maintaining CI systems (e.g., GitHub Actions, Jenkins, or CircleCI) to facilitate automation of the SDLC.
Working knowledge of CD tooling (e.g., AWS Code Pipelines, ArgoCD, Flux), and an understanding of how GitOps deployment models operate

* 3+ years experience with one or more programming languages, preferably Go or Python.

* Knowledge of security best practices, encompassing encryption methods, key management, access control mechanisms, and network security protocols that contribute to the overall security posture of our infrastructure.

* Strong understanding of core Internet protocols: DNS, TCP/SSL, HTTP, gRPC, etc.

* Proven ability to take ownership of projects and work independently. Proficiency in effectively communicating project statuses and diligently documenting activities. Expertise in maintaining meticulous attention to detail across diverse, blockchain-centric environments.

* Track record of successfully delivering complex and high-scale infrastructure

NICE TO HAVES

* Contribution to open source projects and/or developing open source tools is an advantage

* Data visualization and observability tooling experience

* Expertise in scaling and migrating systems in dynamic environments, with a strong understanding of incident management processes.

* Experience in information security, including vulnerability assessments, penetration testing, and implementing and automating security controls, with expertise in security protocols (TLS, SSL, SSH) and frameworks (NIST, CIS, ISO 27001). Experience responding to security audits.

* Service discovery and service mesh technologies such as Istio, Linkerd.

* Good understanding of blockchain fundamentals, including wallets, smart contracts, protocol design

* Experience with Liquid Staking or Ethereum Node Operations platforms

In compliance with local law, we are disclosing the compensation, or a range thereof, for roles in locations where legally required. $225,000 - $250,000 is the annual base salary.
Other rewards may include annual bonuses, short- and long-term incentives, and program-specific awards. In addition, EigenLabs provides various employee benefits, including:

* Employer-covered Medical, Dental, and Vision plans

* 401k

* Unlimited Paid Time Off

* 12 weeks of fully paid maternity and paternity leave

#Salary and compensation
No salary data published by company so we estimated salary based on similar jobs related to Design, Ethereum, Docker, Accounting, DevOps, Cloud, Node, Senior, Engineer and Backend jobs that are similar:

$57,500 - $110,000/year

#Location
Seattle, Washington, United States
# What is the job?

As a DevOps Engineer at Radix, you will lovingly maintain our (your!) virtual infrastructure, while supporting the needs of our collection of community node runners to ensure smooth network operation in a variety of customer environments.

You will begin in familiar territory, performing care and feeding for dozens of virtual machines while gathering performance data and tuning for the CPU/memory/disk sweet spot which best balances performance and cost. You'll plan and implement a network monitoring and alerting system, so that we know when things are going awry and can take action.

After you have a handle on how to keep things running smoothly, you will serve as the primary point of contact with our community node runners, devising best practices for running Radix nodes on a variety of cloud platforms and bare metal servers, and working with our technical writer to fully document those practices. Acting as tier one support, you will respond to raised issues and track them to resolution, while learning from customer experiences to feed suggestions back to our internal DevOps, QA, and Network teams. You will help design, and participate in, our on-call rotation process, to ensure that someone from Radix is always available to investigate a disruption in service.

Along the way you will establish the operational rulebook on how environments are run at Radix, handle some sysadmin-adjacent problems relating to which employees can access what, and form lifelong bonds with a team of incredible people.

#What are we looking for?

* You have maintained production systems on virtual infrastructure, and you possess a healthy collection of war stories from past disasters.

* You have a wealth of knowledge about Docker which you never get to use at parties.

* You're handy with an assortment of scripting languages, ideally Python or Bash.

* You have a deep-seated need to automate things.
The idea of doing a repeatable process manually is abhorrent to you.

* You're patient when dealing with others. You're a good listener, and happy to be a teacher when needed.

* You are a tenacious sleuth, able to persistently research and reason about difficult-to-reproduce problems until you have brought them to a satisfactory resolution.

#What do you need?

* Minimum 3 years of experience in a DevOps role

* Production experience with cloud providers such as AWS, GCP, and Azure (we use AWS)

* Strong familiarity with Docker

* Experience working with logging, monitoring and visualization tools such as Prometheus, Grafana, and Elastic Stack

* Hands-on experience with a scripting language

* Comfortable with at least one infrastructure-as-code tool, such as Ansible, Terraform, or Puppet

* Comfortable configuring and managing at least one popular Linux distribution.

#Things That Will Really Help You Stand Out

* Proven history of managing clustered/distributed environments

* Proven history of node running at scale for any blockchain/distributed ledger

* Experience with Kubernetes (as well as Docker)

#Who are we?

At Radix, we're a team of like-minded thinkers who have long been convinced that we're living in the earliest stages of a global financial revolution. This revolution is being fuelled by decentralized finance (or DeFi for short), which is enabling an assortment of pioneering developers and entrepreneurs to re-invent almost every financial product that is currently traded and invested in traditional markets, without requiring central authorities or siloed infrastructure. DeFi has captured a great deal of attention and investment in the crypto-aware niche, growing assets under management from $1 billion to $40 billion in less than a year. Impressive as its growth has been, its current market size isn't even a rounding error on the over $111 trillion held in traditional finance.
We're focused on what it will take to go from billions to trillions.

Radix went back to first principles to come up with the right technical solution (the first layer-one protocol built specifically for mainstream DeFi), and we have already tested out at over 1 million transactions per second. We're keenly aware that the need for an infinitely scalable platform is only one prerequisite among many for mass adoption, and we're also blazing new ground in the areas of purpose-built developer tools, user experiences, and regulatory integration.

We have forged a path deep into the future of what distributed ledger technology is going to look like and we need you to come and be part of the team that is making that happen right now.

If this job sounds like it was made for you, then please apply directly via the link or email [email protected] for more information.

Please mention the words **SIEGE FRAME LOYAL** when applying to show you read the job post completely. This is a feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.

#Salary and compensation
$60,000 - $120,000/year

#Location
Worldwide
This job post is closed and the position is probably filled. Please do not apply.

DevOps Engineer - Emphasis on Linux / Docker / Node.js / Elasticsearch / MongoDB

The Opportunity:
We're looking for an experienced DevOps engineer based in Phoenix, AZ, Virginia Beach, VA or the Washington, DC metro area. Remote (tele) workers will also be considered for the position if they have excellent communication skills and are willing to travel to one of the above locations several times per year.

The Day to Day:
* Provide operational support and automation tools to application developers
* Bridge the gap between development and operations to ensure successful delivery of projects
* Participate as a member of the application development team
* Build back-end frameworks that are maintainable, flexible and scalable
* Operate and scale the application back-end including the database clusters
* Anticipate tomorrow's problems by understanding what users are trying to accomplish today

Requirements:
* DevOps experience with Linux or FreeBSD
* Experience with Linux Containers and Docker
* Configuration management experience, Salt Stack preferred
* Exposure to the deployment and operations of Node.js applications
* Experience operating and optimizing Elasticsearch at large scale
* Operational experience with Hadoop, MongoDB, Redis, Cassandra, or other distributed big data systems
* Experience with any of JavaScript, Python, Ruby, Perl and/or shell scripting
* Comfort with compute clusters and many terabytes of data
* US Citizenship / Work Authorization

Bonus Points:
* Development experience with Node.js or other HTTP backend tools
* Mac OS X familiarity
* BS or MS in a technology or scientific field of study
* High energy level and pleasant, positive attitude!
* Evidence of working well within a diverse team

Compensation:
* Salary commensurate with experience, generally higher than competitive industries
* Comprehensive benefits package
* Opportunities for advancement and a clear career path

About Us:
We conduct advanced technical research and develop innovative software and systems that help meet network security and reliability challenges for organizations world-wide. You can read more at our web site.

Career Opportunities:
We have many other openings available. For a complete listing, visit jobs.vostrom.com

#Salary and compensation
No salary data published by company so we estimated salary based on similar jobs related to DevOps, JavaScript, InfoSec, Elasticsearch, Java, Perl, Python, Node, Ruby, Admin, Excel, Engineer, Sys Admin, Cassandra, Backend, Design, Docker, Digital Nomad, Travel and Linux jobs that are similar:

$70,000 - $120,000/year
# How do you apply?

This job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.