Remote Senior Site Reliability Engineer ML Platforms
Are you passionate about building and maintaining large-scale production systems that support advanced data science and machine learning applications? Do you want to join a team at the heart of NVIDIA's data-driven decision-making culture? If so, we have a great opportunity for you! NVIDIA is seeking a Senior Site Reliability Engineer (SRE) for the Data Science & ML Platform(s) team. The role involves designing, building, and maintaining services that enable real-time data analytics, streaming, data lakes, observability and ML/AI training and inferencing. The responsibilities include implementing software and systems engineering practices to ensure high efficiency and availability of the platform, as well as applying SRE principles to improve production systems and optimize service SLOs. Additionally, collaborating with our customers to plan and implement changes to the existing system, while monitoring capacity, latency, and performance, is part of the role. To succeed in this position, a strong background in SRE practices, systems, networking, coding, capacity management, cloud operations, continuous delivery and deployment, and open-source cloud enabling technologies like Kubernetes and OpenStack is required. Deep understanding of the challenges and standard methodologies of running large-scale distributed systems in production, solving complex issues, automating repetitive tasks, and proactively identifying potential outages is also necessary. Furthermore, excellent communication and collaboration skills, and a culture of diversity, intellectual curiosity, problem solving, and openness are essential. As a Senior SRE at NVIDIA, you will have the opportunity to work on innovative technologies that power the future of AI and data science, and be part of a dynamic and supportive team that values learning and growth.
The role provides the autonomy to work on meaningful projects with the support and mentorship needed to succeed, and contributes to a culture of blameless postmortems, iterative improvement, and risk-taking. If you are seeking an exciting and rewarding career that makes a difference, we invite you to apply now! What you'll be doing: Develop software solutions to ensure reliability and operability of large-scale systems supporting machine-critical use cases. Gain a deep understanding of our system operations, scalability, interactions, and failures to identify improvement opportunities and risks. Create tools and automation to reduce operational overhead and eliminate manual tasks. Establish frameworks, processes, and standard methodologies to enhance operational maturity, team efficiency, and accelerate innovation. Define meaningful and actionable reliability metrics to track and improve system and service reliability. Oversee capacity and performance management to facilitate infrastructure scaling across public and private clouds globally. Build tools to improve our service observability for faster issue resolution. Practice sustainable incident response and blameless postmortems. What we need to see: Minimum of 10 years of experience in SRE, Cloud platforms, or DevOps with large-scale microservices in production environments. Master's or Bachelor's degree in Computer Science or Electrical Engineering or CE or equivalent experience. Strong understanding of SRE principles, including error budgets, SLOs, and SLAs. Proficiency in incident, change, and problem management processes. Skilled in problem-solving, root cause analysis, and optimization. Experience with streaming data infrastructure services, such as Kafka and Spark. Expertise in building and operating large-scale observability platforms for monitoring and logging (e.g., ELK, Prometheus). Proficiency in programming languages such as Python, Go, Perl, or Ruby.
Hands-on experience with scaling distributed systems in public, private, or hybrid cloud environments. Experience in deploying, supporting, and supervising services, platforms, and application stacks. Ways to stand out from the crowd: Experience operating large-scale distributed systems with strong SLAs. Excellent coding skills in Python and Go and extensive experience in operating data platforms. Knowledge of CI/CD systems, such as Jenkins and GitHub Actions. Familiarity with Infrastructure as Code (IaC) methodologies and tools. Excellent interpersonal skills for identifying and communicating data-driven insights. NVIDIA leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for exceptional people like you to help us accelerate the next wave of artificial intelligence. The base salary range is 224,000 USD - 425,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis. NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. NVIDIA is the world leader in accelerated computing. 
NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society. Learn more about NVIDIA. \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Python, DevOps, Cloud, Senior and Engineer jobs that are similar:\n\n
$60,000 — $135,000/year\n
\n\n#Benefits\n
401(k)\n\nDistributed team\n\nAsync\n\nVision insurance\n\nDental insurance\n\nMedical insurance\n\nUnlimited vacation\n\nPaid time off\n\n4 day workweek\n\n401k matching\n\nCompany retreats\n\nCoworking budget\n\nLearning budget\n\nFree gym membership\n\nMental wellness budget\n\nHome office budget\n\nPay in crypto\n\nPseudonymous\n\nProfit sharing\n\nEquity compensation\n\nNo whiteboard interview\n\nNo monitoring system\n\nNo politics at work\n\nWe hire old (and young)\n\n
\n\n#Location\nUS, CA, Santa Clara
Please mention that you found the job on Remote OK; this helps us get more companies to post here. Thanks!
When applying for jobs, you should NEVER have to pay to apply. You should also NEVER have to pay to buy equipment which they then pay you back for later. Also never pay for trainings you have to do. Those are scams! NEVER PAY FOR ANYTHING! Posts that link to pages with "how to work online" are also scams. Don't use them or pay for them. Also always verify you're actually talking to the company in the job post and not an imposter. A good idea is to check the domain name for the site/email and see if it's the actual company's main domain name. Scams in remote work are rampant, be careful! Read more to avoid scams. When clicking on the button to apply above, you will leave Remote OK and go to the job application page for that company outside this site. Remote OK accepts no liability or responsibility as a consequence of any reliance upon information on there (external sites) or here.
Remote Senior DevOps Lead Cloud & Autonomous System
\nAbout Cyngn \nBased in Menlo Park, CA, Cyngn is a publicly traded autonomous vehicle company. Whether at a warehouse floor, mine, or construction site, our self-driving technology can be deployed in various commercial domains across various vehicle form factors. To build this emergent technology, we seek innovative, motivated, and experienced leaders to join our team and move this field forward. If you like to build, tinker, and create with a team of trusted and passionate colleagues, then Cyngn is the place for you. Key reasons to join Cyngn: \n\n\nWe are Small and Big. \nWith under 100 employees, Cyngn is still a company that operates with the energy of a startup. On the other hand, we are publicly traded. Combined, our employees not only work in close-knit teams with close mentorship from company leaders, but they also get access to the liquidity of our publicly traded equity. This gives our small team the opportunity to make a big impact in industries that other people aren't touching, without taking on the risks associated with untested organizations. \n\n\nWe Build Today and Deploy Tomorrow. \nOur employees aren't just researchers but are creating reality. In other words, the autonomous vehicles we're building are designed to go to real clients right away. We are driven by our passion for innovation, our ability to see the entire product, and the real impact of our work in the real world. At Cyngn, the distance between the theoretical and the actual is razor-thin. \n\n\nWe aren't robots. We just build them. \nRead our Glassdoor reviews, and you'll find that one of the best things about working here is the people. We are an inclusive, diverse team of top talent with exceptional synergy. We thrive on open collaboration and a trusting and creative work environment that is fueled by our passion for the industry. At Cyngn, everyone's voice is valued, and each of our unique perspectives is celebrated. 
It's the people that allow our company to continue to grow bigger and better every day.\n\n\n\n\nAbout this Role:\nAs a Senior DevOps Lead at Cyngn, you will play a vital role in architecting and managing infrastructure across cloud and autonomous vehicle systems. This position combines traditional cloud DevOps leadership with specialized expertise in robotics and autonomous systems infrastructure. You will bridge the gap between cloud operations and edge computing while leading a team of DevOps engineers to build and maintain scalable, reliable infrastructure for our autonomous vehicle platform.\n\n\n\nWhat you will do in this role\n* Lead and architect cloud and vehicle infrastructure initiatives across AWS and ROS/Linux environments \n* Design and implement scalable solutions for both cloud services and autonomous vehicle systems \n* Establish and maintain DevOps best practices, CI/CD pipelines, and infrastructure as code \n* Drive observability, monitoring, and incident response strategies \n* Optimize performance and cost efficiency of cloud and edge computing resources \n* Mentor team members and foster a developer-friendly environment \n* Manage on-call rotations and incident response processes \n* Architect solutions for processing and storing large-scale vehicle telemetry data \n* Lead security initiatives and compliance efforts across infrastructure \n* Design and implement solutions for both cloud services and autonomous vehicle systems \n* Optimize system performance for real-time processing of high-bandwidth sensor data \n* Develop and maintain documentation for system architecture and integration procedures \n\n\n\nWho you are\n* 10+ years of relevant DevOps/Infrastructure experience\n* Proven track record as a technical lead in platform or infrastructure teams\n* Advanced expertise in AWS services, infrastructure as code (Terraform), and Kubernetes\n* Strong experience with service mesh (Istio) and Helm/Kustomize\n* Deep understanding of ROS/ROS2 and 
Linux kernel configurations\n* Experience with GPU configurations and ML infrastructure\n* Expertise in ARM and NVIDIA CUDA platform configurations\n* Strong programming skills in Python and shell scripting\n* Experience with infrastructure automation (Ansible)\n* Expertise in CI/CD tools (Jenkins, GitHub Actions)\n* Strong system architecture and design skills\n* Excellence in technical documentation\n* Outstanding problem-solving abilities\n* Strong leadership and mentoring capabilities\n\n\n\nNice to haves\n* Experience with autonomous vehicle systems\n* Track record of optimizing GPU-based ML infrastructure\n* Experience with large-scale IoT deployments\n* Contributions to open-source projects\n* Experience with real-time systems and low-latency requirements\n* Expertise in security implementations including SSO, IdP, and AWS Cognito\n* Experience with JFrog artifactory and container registry management\n* Proficiency in AWS IoT Greengrass\n* Experience with container resource management on edge devices\n* Understanding of CPU affinity and priority scheduling\n* Track record of implementing cost optimization strategies\n* Experience with scaling systems both horizontally and vertically\n\n\n\nBenefits & Perks\n* Health benefits (Medical, Dental, Vision, HSA and FSA (Health & Dependent Daycare), Employee Assistance Program, 1:1 Health Concierge)\n* Life, Short-term, and long-term disability insurance (Cyngn funds 100% of premiums)\n* Company 401(k)\n* Commuter Benefits\n* Flexible vacation policy\n* Stock options for all full-time employees\n* Sabbatical leave opportunity after five years with the company\n* Paid Parental Leave\n* Daily lunches for in-office employees and fully stocked kitchen with snacks and beverages\n* Monthly meal and tech allowances for remote employees\n\n\n\n\n\n$180,000 - $240,000 a year\n \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Design, Python, DevOps, Cloud, Senior and Linux jobs that are similar:\n\n
$45,000 — $75,000/year\n
\n\n#Benefits\n
401(k)\n\nDistributed team\n\nAsync\n\nVision insurance\n\nDental insurance\n\nMedical insurance\n\nUnlimited vacation\n\nPaid time off\n\n4 day workweek\n\n401k matching\n\nCompany retreats\n\nCoworking budget\n\nLearning budget\n\nFree gym membership\n\nMental wellness budget\n\nHome office budget\n\nPay in crypto\n\nPseudonymous\n\nProfit sharing\n\nEquity compensation\n\nNo whiteboard interview\n\nNo monitoring system\n\nNo politics at work\n\nWe hire old (and young)\n\n
\n\n#Location\nMenlo Park, CA
About Phaidra\n\nPhaidra is building the future of industrial automation.\n\nThe world today is filled with static, monolithic infrastructure. Factories, power plants, buildings, etc. operate the same way they have operated for decades, because the controls programming is hard-coded: thousands of lines of rules and heuristics define how the machines interact with each other. The result of all this hard-coding is that facilities are frozen in time, unable to adapt to their environment while their performance slowly degrades.\n\nPhaidra creates AI-powered control systems for the industrial sector, enabling industrial facilities to automatically learn and improve over time. Specifically:\n\n\n* We use reinforcement learning algorithms to provide this intelligence, converting raw sensor data into high-value actions and decisions.\n\n* We focus on industrial applications, which tend to be well-sensorized with measurable KPIs - perfect for reinforcement learning.\n\n* We enable domain experts (our users) to configure the AI control systems (i.e. agents) without writing code. They define what they want their AI agents to do, and we do it for them.\n\n\n\n\nOur team has a track record of applying AI to some of the toughest problems. From achieving superhuman performance with DeepMind's AlphaGo, to reducing the energy required to cool Google's Data Centers by 40%, we deeply understand AI and how to apply it in production for massive impact.\n\nPhaidra is based in the USA but 100% remote; we do not have a physical office. We hire employees internationally with the help of our partner, OysterHR. Our team is currently located throughout the USA, Canada, UK, Norway, Italy, Spain, Portugal, and India.\n\n**Please only apply to one opening. If you are a better fit for another opening, our team will move your application. 
Candidates who apply to multiple openings will not be considered.**\nWho You Are\n\nWe are looking for a very experienced Software Engineer with a focus on MLOps tech leadership to be a part of our growing AI Platform team. You are bold and creative, and have deep empathy for customers. You will design and implement significant parts of the code base and will have the opportunity to make an immediate impact with your work and guide the product and team as we grow.\n\nYou are curious and like to understand technologies and their tradeoffs in depth - providing technical guidance to the team and peers as and when required. Leading by example, you have accumulated a wealth of insights and experiences from your hands-on involvement in the field, and you are committed to rolling up your sleeves and getting work done. You like joining and supporting other engineers in their work to learn from them as well as letting them benefit from your expertise and experience.\n\nYou have the motivation and skills to identify technical product needs, initiate projects, and own their delivery, including the involvement of engineering peers as needed. You are comfortable with challenging the status quo respectfully to drive and deliver technical excellence in the team.\n\n\n* We are seeking a team member located within one of the following areas: USA/Canada/UK/EU\n\n\n\nResponsibilities\n\nThe AI Platform team you are joining is responsible for building the core platform that powers model training, inference and decision making in our products. Furthermore, the team owns MLOps and the services hosting our AI capabilities. Productionizing results from Research, as well as extending our systems and providing support according to our customer needs, falls within the team's responsibilities as well. 
You will join this team as a very experienced engineer with a focus on MLOps solutions to grow our expertise in that area, but also contribute as a software engineer more widely in the team.\n\nAs an organization, we strongly believe in expertise across the stack. As such, you will experience flavors of Machine Learning, Software Engineering, Distributed Systems, MLOps and DevOps.\n\nIn particular, you will:\n\n\n* Design, build and lead the MLOps initiatives and vision for the AI Platform to strengthen automation, orchestration, versioning, observability, monitoring and collaboration for the platform.\n\n* Build and design scalable components for the AI Platform to allow high throughput training and inference for RL agents doing realtime inference for autonomous control of industrial systems.\n\n* Contribute to the design and implementation of the product backend by writing REST & gRPC API services and scalable event-driven backend applications.\n\n* Design clear, extensible software interfaces for the team's customers and maintain a high release quality bar.\n\n* Perform DevOps duties of CI/CD, Release & Deployment management.\n\n* Be a part of our global production on-call team, and own & operate your services in production, meeting Phaidra's high bar for operational excellence.\n\n* Lead cross-functional initiatives collaborating with engineers, product managers and TPM across teams.\n\n* Mentor your peers and be a technical role-model in the team.\n\n\n\nOnboarding\n\nIn your first 30 days...\n\n\n* You will be immersed in an onboarding program that introduces you to Phaidra and our product.\n\n* You will spend time in the Engineering org, learning how the teams operate, interact, and approach problems.\n\n* You will read various parts of our handbook and familiarize yourself with the documentation culture at Phaidra.\n\n* You will set up your development environment and start working on an onboarding exercise that will introduce you to various parts of our code 
base.\n\n* You will learn about how we use agile and be able to navigate our sprint boards and backlogs.\n\n* You will learn about various team standards and development & release processes.\n\n* You will start to learn about our system architecture and infrastructure.\n\n* You will start picking up a few good "first tasks" to get yourself accustomed to the end-to-end release flow.\n\n\n\n\nIn your first 60 days...\n\n\n* You will get a solid understanding of what Phaidra does and how we do it.\n\n* You will meet with team members across Phaidra and start building relationships that will help you be successful at your job.\n\n* You will complete the onboarding exercise and will be on your way to completing your first production task.\n\n* You will take ownership of the MLOps work on the team, identify gaps and propose roadmap items on the topic.\n\n\n\n\nIn your first 90 days...\n\n\n* You will be fully integrated in the team and with team members across the company.\n\n* You will have a more in-depth understanding of our system architecture and infrastructure.\n\n* You will complete your first on-call experience helping monitor and improve our production environments.\n\n* You will become an expert with our tooling.\n\n* You will start to contribute to knowledge sharing throughout Phaidra and the team.\n\n* You will proactively drive MLOps topics in the team and represent it technically throughout the company.\n\n\n\nKey Qualifications\n\n\n* 10+ years of work experience.\n\n* Proven record of impact as a Tech Leader and bar-raiser for ambitious Software Engineering teams\n\n* Strong experience in designing and implementing MLOps solutions for AI production systems\n\n* Extensive experience with platform Software Engineering with the ability to contribute on all levels as an individual contributor and tech leader\n\n* Strong expertise in building, operating and monitoring large scale multi-tenant systems with high availability, fault tolerance, performance 
tuning, monitoring, and metrics collection\n\n* Ability to take ownership of realtime production systems - aligning technical with business requirements, raising the bar for operational excellence and on-call incident handling\n\n* Strong expertise in Python and Cloud environments\n\n* Very good grasp of Machine Learning (especially Deep Learning) fundamentals\n\n* Ability to collaborate and communicate effectively in an all-remote setting\n\n* Doing your work with curiosity, ownership, transparency & directness, outcome orientation, and customer empathy.\n\n\n\nBonus\n\n\n* Experience with building applications that can be deployed in cloud, as well as in hybrid or on-prem environment\n\n* Exposure to Reinforcement Learning or other in-depth knowledge on modern ML applications\n\n* Experience with industrial applications, industrial control systems, IoT, sensor time series applications, or similar\n\n\n\nRelevant Technologies from our Stack\n\n\n* Python, Go\n\n* PyTorch, PyTorch Lightning\n\n* Ray.io, Prefect, mlflow\n\n* REST & gRPC micro-services\n\n* Docker, Kubernetes, Terraform & Kapitan\n\n* GCP - GKE, PubSub, CloudSQL, BigTable, Postgres, etc.\n\n* Grafana Cloud, Prometheus\n\n* Poetry, Pants\n\n* Gitlab CI, ArgoCD, Atlantis\n\n\n\nGeneral Interview Process\n\nAll of our interviews are held via Google Meet, and an active camera connection is required.\n\n* \n\nMeeting with Operations (30 minutes): The purpose of this interview is to meet you, learn more about your background, discuss what you are looking for in a new position and cover formalities around your application.\n\n\n* \n\nTech Lead interview (60 minutes): This interview is a combination of technical and cultural fit assessment. We will cover your technical experience and the skills as an engineer and a tech lead while discussing projects that you have worked on in the past. 
You will meet the manager for the role as well as our VP of Engineering, with the opportunity to ask any questions about the team, role and engineering at Phaidra.\n\n\n* \n\nML system design & SRE (90 minutes): In this interview, we will go over a real world MLOps problem. You can expect to draw architecture diagrams using boxes & arrows in your browser. We will talk about system design, scalability and monitoring.\n\n\n* \n\nML interview (60 minutes): This interview will focus on Machine Learning approaches, algorithms and theory. You will be asked about ML algorithms you are familiar with, how they work under the hood and how to use them in an applied setting.\n\n\n* \n\nCulture fit interview with Phaidra's co-founders (30 minutes): This interview focuses on alignment with Phaidra's values and the mutual cultural fit.\n\n\n\nBase Salary\n\n\n* US Residents: $156,000-$234,000/year\n\n* UK Residents: £108,000-£162,000/year\n\n\n\n\nSalary ranges for EU countries will vary based on the market rate for the location.\n\nThis position will also include equity.\n\nThese are good-faith estimates of the base salary range for this position. 
Multiple factors such as experience, education, level, and location are taken into account when determining compensation.\nBenefits & Perks\n\n\n* Fast-paced and team-oriented environment where you will be instrumental in the direction of the company.\n\n* Phaidra is a 100% remote company with a digital nomad policy.\n\n* Competitive compensation & equity.\n\n* Outsized responsibilities & professional development.\n\n* Training is foundational; functional, customer immersion, and development training.\n\n* Medical, dental, and vision insurance (exact benefits vary by region).\n\n* Unlimited paid time off, with a minimum of 20 days off per year requirement.\n\n* Paid parental leave (exact benefits vary by region).\n\n* Home office setup allowance and company MacBook.\n\n* Monthly remote work stipend.\n\n\n\nOn being Remote\n\nWe are thoughtful about remote collaboration. We look to the pioneers - like Gitlab - for inspiration and best practices to create a stellar remote work environment. We have a documentation-first culture and actively practice asynchronous communication in everything we do. Our team stays connected through tools like Slack and video chat. Most teams meet daily, and we have dedicated all-hands meetings bi-weekly to build strong relationships. We hold virtual team building events once per month - and even hold virtual socials to watch rocket launches! We have a yearly in-person, all-company summit in locations like Seattle, Athens, Goa, and Barcelona.\nEqual Opportunity Employment\n\nPhaidra is an Equal Opportunity Employer; employment with Phaidra is governed on the basis of merit, competence, and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability, or any other legally protected status. 
We welcome diversity and strive to maintain an inclusive environment for all employees. If you need assistance with completing the application process, please contact us at [email protected].\nE-Verify Notice\n\nPhaidra participates in E-Verify, an employment authorization database provided through the U.S. Department of Homeland Security (DHS) and Social Security Administration (SSA). As required by law, we will provide the SSA and, if necessary, the DHS, with information from each new employee's Form I-9 to confirm work authorization for those residing in the United States.\n\nAdditional information about E-Verify can be found here.\n\n#LI-Remote\n\nWE DO NOT ACCEPT APPLICATIONS FROM RECRUITERS.\n\n \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Design, Python, DevOps, Cloud, API, Engineer and Backend jobs that are similar:\n\n
$70,000 — $105,000/year\n
\n\n#Benefits\n
401(k)\n\nDistributed team\n\nAsync\n\nVision insurance\n\nDental insurance\n\nMedical insurance\n\nUnlimited vacation\n\nPaid time off\n\n4 day workweek\n\n401k matching\n\nCompany retreats\n\nCoworking budget\n\nLearning budget\n\nFree gym membership\n\nMental wellness budget\n\nHome office budget\n\nPay in crypto\n\nPseudonymous\n\nProfit sharing\n\nEquity compensation\n\nNo whiteboard interview\n\nNo monitoring system\n\nNo politics at work\n\nWe hire old (and young)\n\n
\n\n#Location\nSeattle, Washington, United States
Who You Are\n\nWe are looking for a driven Software Engineer (MLOps) to be a part of our growing AI Platform team. You are bold and creative, and have deep empathy for customers who may not be tech-savvy. You will design and implement significant parts of the code base and will have the opportunity to make an immediate impact with your work and guide the product and team as we grow.\n\nYou are curious and like to understand technologies and their tradeoffs in depth - providing technical guidance to the team and peers as and when required. Leading by example, you have accumulated a wealth of insights and experiences from your hands-on involvement in the field, and you are committed to rolling up your sleeves and getting work done. You like joining and supporting other engineers in their work to learn from them as well as letting them benefit from your expertise and experience.\n\nYou have the motivation and skills to identify technical product needs, initiate projects, and own their delivery, including the involvement of engineering peers as needed. You are comfortable with challenging the status quo respectfully to drive and deliver technical excellence in the team.\n\n**We are seeking a team member located within one of the following areas: USA/Canada/UK**\nResponsibilities\n\nThe AI Platform team you are joining is responsible for building the core platform that powers model training, inference and decision making in our products. Furthermore, the team owns MLOps and the services hosting our AI capabilities. Productionizing results from Research, as well as extending our systems and providing support according to our customer needs, falls within the team's responsibilities as well. You will join this team as an experienced engineer with a focus on MLOps solutions to grow our expertise in that area, but also contribute as a software engineer more widely in the team.\n\nAs an organization, we strongly believe in expertise across the stack. 
As such, you will experience flavors of Machine Learning, Software Engineering, Distributed Systems, MLOps and DevOps.\n\nIn particular, you will:\n\n\n* Design, build and lead the MLOps initiatives and vision for the AI Platform to strengthen automation, orchestration, versioning, observability, monitoring and collaboration for the platform.\n\n* Build and design scalable components for the AI Platform to allow high throughput training and inference for RL agents doing realtime inference for autonomous control of industrial systems.\n\n* Contribute to the design and implementation of the product backend by writing REST & gRPC API services and scalable event-driven backend applications.\n\n* Design clear, extensible software interfaces for the team's customers and maintain a high release quality bar.\n\n* Design and optimize data storage & retrieval mechanisms for high throughput, security & ease of access.\n\n* Perform DevOps duties of CI/CD, Release & Deployment management.\n\n* Be a part of our global production on-call team, and own & operate your services in production, meeting Phaidra’s high bar for operational excellence.\n\n* Lead cross-functional initiatives collaborating with engineers, product managers and TPM across teams.\n\n* Mentor your peers and be a technical role-model in the team.\n\n\n\nOnboarding\n\nIn your first 30 days…\n\n\n* You will be immersed in an onboarding program that introduces you to Phaidra and our product.\n\n* You will spend time in the Engineering org, learning how the teams operate, interact, and approach problems.\n\n* You will read various parts of our handbook and familiarize yourself with the documentation culture at Phaidra.\n\n* You will set up your development environment and start working on an onboarding exercise that will introduce you to various parts of our code base.\n\n* You will learn about how we use agile and be able to navigate our sprint boards and backlogs.\n\n* You will learn about various team standards 
and development & release processes.\n\n* You will start to learn about our system architecture and infrastructure.\n\n* You will start picking up a few good “first-tasks” to get yourself accustomed to the end to end release flow.\n\n\n\n\nIn your first 60 days…\n\n\n* You will get a solid understanding of what Phaidra does and how we do it.\n\n* You will meet with team members across Phaidra and start building relationships that will help you be successful at your job.\n\n* You will complete the onboarding exercise and will be on your way to completing your first production task.\n\n* You will take ownership of the MLOps work on the team, identify gaps and propose roadmap items on the topic.\n\n\n\n\nIn your first 90 days…\n\n\n* You will be fully integrated in the team and with team members across the company.\n\n* You will have a more in-depth understanding of our system architecture and infrastructure.\n\n* You will complete your first on-call experience helping monitor and improve our production environments.\n\n* You will become an expert with our tooling.\n\n* You will start to contribute to knowledge sharing throughout Phaidra and the team.\n\n* You will proactively drive MLOps topics in the team and represent it technically throughout the company.\n\n\n\nKey Qualifications\n\n\n* 7+ years of work experience.\n\n* Bachelor's or Master's in Computer Science, or equivalent experience.\n\n* Strong experience in designing and implementing MLOps solutions for AI production systems\n\n* Expertise with production Software Engineering - relational and non-relational data modelling, micro-services, understanding of event driven systems, etc.\n\n* Strong experience building large scale multi-tenant systems with high availability, fault tolerance, performance tuning, monitoring, and statistics/metrics collection.\n\n* Strong expertise in Python and Cloud environments\n\n* Good grasp of Machine Learning (especially Deep Learning) fundamentals.\n\n* Ability to 
collaborate and communicate effectively in an all-remote setting\n\n* Doing your work with curiosity, ownership, transparency & directness, outcome orientation, and customer empathy.\n\n\n\nBonus\n\n\n* Experience as a service owner of a realtime production system - operating & monitoring services in production, including using observability tooling such as Prometheus, Grafana, Tempo or equivalent offerings and incident management.\n\n* Experience with building applications that can be deployed in cloud, hybrid or on prem environments\n\n* Exposure to Reinforcement Learning\n\n\n\nOur Stack\n\n\n* Languages - (Backend) Python, Go; (Frontend) JavaScript/TypeScript, React; Customer SDK & Clients - C# .NET\n\n* PyTorch\n\n* Cypress\n\n* Docker, Kubernetes, Terraform & Kapitan\n\n* Gitlab CI, ArgoCD, Atlantis, Vercel\n\n* GCP - GKE, PubSub, CloudSQL, BigTable, Postgres, etc.\n\n* Ray.io\n\n* REST & gRPC micro-services\n\n* Poetry, Pantsbuild\n\n\n\nGeneral Interview Process\n\nAll of our interviews are held via Google Meet, and an active camera connection is required.\n\n* Initial screening interview with a People Operations team member (30 minutes): The purpose of this interview is to meet you, learn more about your background, and discuss what you are looking for in a new position.\n\n* Hiring manager interview (30 minutes): The purpose of this meeting is for you to get to know the manager for the role. This chat will mainly focus on your previous experience and technical background. You can expect to talk about projects that you have worked on in the past and ask any questions about the team & role.\n\n* Technical Interview 1 (60 minutes): The purpose of this interview is to assess your skills in Machine Learning and related mathematics.\n\n* Technical Interview 2 (90 minutes): In this interview, we will go over a real world MLOps problem. You can expect to draw architecture diagrams using boxes & arrows in your browser. 
We will talk about system design, scalability and monitoring.\n\n* Meeting with VP of Engineering (30 minutes): This interview is a combination of technical and cultural fit assessment. You will cover the technical experience and the skills that you bring and have an opportunity to ask any questions about the team’s culture or vision.\n\n* Culture fit interview with Phaidra’s co-founders (30 minutes): This interview focuses on alignment with Phaidra’s values\n\n\nBase Salary\n\nUS Residents: $115,200-$208,800/year\n\nUK Residents: £96,400-£144,000/year\n\nThis position will also include equity.\n\nThese are good faith estimates of the base salary range for this position. Multiple factors such as experience, education, level, and location are taken into account when determining compensation. \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Design, Python, DevOps, Cloud, API, Senior, Engineer and Backend jobs that are similar:\n\n
$65,000 — $110,000/year\n
\n\n#Benefits\n
401(k)\n\nDistributed team\n\nAsync\n\nVision insurance\n\nDental insurance\n\nMedical insurance\n\nUnlimited vacation\n\nPaid time off\n\n4 day workweek\n\n401k matching\n\nCompany retreats\n\nCoworking budget\n\nLearning budget\n\nFree gym membership\n\nMental wellness budget\n\nHome office budget\n\nPay in crypto\n\nPseudonymous\n\nProfit sharing\n\nEquity compensation\n\nNo whiteboard interview\n\nNo monitoring system\n\nNo politics at work\n\nWe hire old (and young)\n\n
\n\n#Location\nSeattle, Washington, United States
This job post is closed and the position is probably filled. Please do not apply. Work for Platform.sh and want to re-open this job? Use the edit link in the email when you posted the job!
Closed by robot after apply link errored w/ code 404 3 years ago
\nPlatform.sh is a groundbreaking hosting and development tool for web applications. We’re a European VC-Backed scaleup with a host of blue-chip Enterprise clients and a string of awards and grants. To reinforce our technical prowess, we are looking to grow our engineering team. If you’re looking for an exciting, high-growth opportunity with an award-winning, cutting-edge company, this could be just the job for you.\n\nWe run dozens of cloud regions all over the world with a mix of clients from individual developers running small development clusters, to the biggest companies on earth that run some of their critical apps on us.\n\nThe company is fully distributed and remote first, with a strong accent on diversity and inclusion in all of its dimensions (gender, sexual orientation, race, country of origin - you have it, we want it). You won’t find any ableism or ageism either. \n\nFor its groundbreaking PaaS solution, https://platform.sh is looking for a Pythonian Cloud Engineer with a taste for Go, good Linux system understanding, and a real hunger for the challenges of building robust, distributed systems.\n\nPlatform.sh is a PaaS shrouded in a lot of black magic (we can consistently clone a whole running cluster, with its state, databases, indexes in a matter of seconds). We want to get this down to the hundreds of milliseconds domain. Interested? There is more...\n\n\n* Our external API is pure Hypermedia REST + OAuth on top of Pyramid. It mechanizes the Git layer and needs more features.\n\n* We can consistently generate from the same manifest a Docker container, an LXC one, or VM disk images (AWS, Azure, Google Cloud, OpenStack).\n\n* We probably have the highest container density in the industry. We need to get it higher.\n\n* We have been working hard on a fast, resilient, and cost-optimized observability framework in order to know how the system behaves; now we want to better predict how it will behave. 
\n\n* We support any Python, Ruby, NodeJS, PHP, Java, .NET, and, of course, Elixir. It is time to roll out Rust; somebody needs to push that button.\n\n* We need to have more auto-healing on the high-availability clusters. We need more performance out of our multi-protocol ssh proxy. We need work on our Ceph Implementation; we have strictly cool things to do on the Edge. We need… great ideas on how to make Platform.sh even better. Interested? Join us!\n\n\n\n\nThis is a remote position and very occasional travel to cool places like Paris, France, may be required.\n\nSkills & requirements:\n\nRequired:\n\n\n* Be a really really good dev that likes testing, understands how an OS works, knows networking, how git works, and the constraints of a distributed system.\n\n* Be proficient in Python or in Go (expertise in either or both, highly appreciated). But if you are a sufficiently fast learner and have a couple of other languages under your belt (such as Lua, Rust, Erlang, Ruby, or C …), we might bite.\n\n\n\n\nWould be really great if you had:\n\n\n* Experience with C / C++ (we contribute to a bunch of upstream projects, like LXC) is a plus; love of C or C++ not required\n\n* Great knowledge of Git\n\n* Good Networking background (routing/protocols)\n\n* Good grasp of practical security and cryptography\n\n* Experience with other programming languages (e.g. Rust, Haskell, Java, Javascript, Ruby, Common Lisp, PHP)\n\n* Good knowledge of how the Web works (hacking Nginx with Lua a plus). You may want to brush up on HTTP before the interview\n\n* Good understanding of how database systems and search engines work\n\n* A good notion of distributed systems (consensus protocols like Raft/Paxos, eventual consistency models, gossip protocols)\n\n* Mad Debian Skills. 
Sporting a Debian plaid cloth during the interview is not frowned upon\n\n\n\n\nTo be clear so you are not surprised in the technical interview, this job is very much more for a systems engineer, rather than an application developer. So knowing about system calls is important, while knowing Django, not so much.\n\nA bit about seniority, diplomas, and experience: \n\n\n* We don’t care, at all, about diplomas, you have a Ph.D. in computer science? That is lovely! We love science. You are a self-taught hacker whose main deployment target for years was Arduino? You could very well be a match.\n\n* We have senior juniors and junior seniors. Everybody is. Some of us have been coding for multiple decades. Some of us are fresh out of school. We expect you to have some very strong points. But we know everybody has continents of ignorance; that’s fine. As long as you love learning; you will be surrounded by people who love to share what they have learned.\n\n* Specifically, there is a catch-22 for “seniority requirements” for underrepresented candidates. Try us. We will go the extra mile (more probably a kilometer btw). We will take into account any valuable candidate with less experience in DevOps and System roles if needs be. \n\n\n\n\nA bit about the interview process: \n\n\n* It is usually quite short. Two or three remote interviews. There will be no whiteboarding. Few if any algorithmic questions (unless you love those we would not like to frustrate the preppers).\n\n* The people interviewing you are going to be people you may end up working with. The interviews are going to be a bit “all over the place” with a bunch of detailed questions. The point is less to get the right answer than to give a glimpse of your “technical intellectual world”. You do not remember by heart the flags on a TCP packet? Well, neither do we, but it is a good conversation starter.\n\n\n \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Cloud, Engineer, DevOps, C, Git, Python, API, Travel, Senior, Junior, Nginx and Linux jobs that are similar:\n\n
$70,000 — $120,000/year\n
# How do you apply?\n\nThis job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.
This job post is closed and the position is probably filled. Please do not apply. Work for Platform.sh and want to re-open this job? Use the edit link in the email when you posted the job!
\nUs\n\nWe are a product minded innovative and ambitious company thriving in the DevOps cloud industry.\n\nWe are a startup in hyper growth mode and with a global reach. \n\nWe have hired 100 people over the last 24 months, worldwide.\n\nAs we rapidly scale, we need to keep pace with high growth needs for hiring, developing employee performance and learning capacity while providing a differentiated work environment. \n\n\nYou\n\nYou are a technical trainer or tech recruiter with real-world, hands-on technical experience and knowledge of modern cloud architectures and virtualization technologies. \n\nYou are forward thinking, always aiming for enhancements and efficiency. \n\nThis is a role creation that will allow you to combine your passion for teaching and technology. This new role will drive significant impact and have high visibility internally.\n\nYour primary focus will be to offer a great candidate experience:\n\n\n* tailor the job descriptions to convey our company mission and appeal to the type of candidates we want\n\n* build top notch scorecards\n\n* undertake all technical assessments, use critical thinking, active listening and emotional intelligence to assess candidates\n\n\n\n\n\nOn an on-going basis, you will constantly learn about our product new features and innovations to constantly update your onboarding content. \n\nAs you are close to the engineering teams and understand the specificities of each position, you will actively build, customize and deliver technical onboarding content for all our engineering teams (developers, DevOps, architects, product, …) to facilitate and ensure successful experience for newly appointed engineers. 
\n\nYou will be in charge of developing & maintaining technical training content (screen sharing exercises, presentations, and accompanying materials).\n\n\nKeys to being the right fit for this position\n\nFlexible and capable of progressing quickly in a fast-paced environment\n\nTraining or mentoring experience for SaaS technology products - or self-starter with proven success taking ownership of training projects\n\n3+ years of hands on experience in tech recruitment or training, development, system administration, project management, consulting or training - technical degree or relevant work experience is required\n\nDeep understanding of platform solutions Linux IaaS, SaaS, PaaS, Security, Storage, networking, OSS tools and emerging computing trends\n\nYou don’t need to be a top-notch developer yourself, but you do need to be able to assess a baseline. So we expect you to have a good grasp of at least a couple of programming languages; Python and Go Lang would be great. But we’d really appreciate someone with some C skills, maybe a bit of Rust and potentially a good amount of fondness for functional languages\nExcellent communicator with great interpersonal skills\n\nExcellent written and spoken English \n\nCollaborative attitude\n\nCapacity to work with highly international teams\n\nRigorous and reactive \n\nAutonomous and proactive\n\nYou can be based anywhere (but preferably working Americas’ or EMEA timezone though) \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Education, Teaching, DevOps, C, Cloud, Python, SaaS and Linux jobs that are similar:\n\n
$70,000 — $120,000/year\n
This job post is closed and the position is probably filled. Please do not apply. Work for Experian and want to re-open this job? Use the edit link in the email when you posted the job!
\nPotential for this role to be in other cities for the right candidate.\n\nExperian’s Technology team is seeking a talented Senior Cloud Infrastructure Engineer. This role is a hands on technical position responsible for designing and implementing common framework solutions in a cloud based environment. In the world of DevOps, infrastructure and code become one. This person will drive platform enhancements, enable automation/self-service and implement new solutions that enable DevOps capabilities. As part of the greater Technology Operations & Infrastructure team, this person will be responsible for supporting the end to end deployment of platform systems – from virtual networks, to servers, to application services for an e-commerce environment. \n\nResponsibilities: \n\n• Implement & build automation tools such as Jenkins, Puppet, and Python scripting for streamlined deployments & systems updates. \n• Creating monitoring capabilities & alerting for Technical Operations Center (TOC) team members. \n• Design & deploy cloud platform capabilities using AWS (full stack – network, load balancing, DNS, security, databases). \n• Implement infrastructure capabilities in an automated cloud world – such as backups, security tools, IAM, monitoring, etc. \n• Perform advanced technical troubleshooting for cloud & e-commerce environments. \n• Engineer automation tools and the continuous delivery process. \n• Work extensively with continuous integration systems and are able to translate that understanding into workable pipelines and tools. \n• Implement and support a multi-cloud/hybrid solutions for disaster recovery. \n• Enhance system performance and features on a regular basis for enhanced customer experience. \n• Lead technical operations projects from requirements to design to implementation to operations. \n\n• Bachelor’s degree in Computer Science, Engineering or similar field from an accredited four year university required. 
\n• 5+ years’ experience in a system administrator role. \n• Strong knowledge of the DevOps toolchain on Linux/Windows platforms: Jenkins, Python/C++/Java, Ansible, Puppet, Confluence, Git/TFS, Jasmine, Chocolatey, CloudFormation, etc. \n• Deploying automation solutions in a public cloud environment such as AWS. \n• Strong communication and collaboration skills across the enterprise. \n• Deployed applications in Amazon AWS, Google or Azure. \n• Migrated applications from on-premises data centers to cloud service providers. \n• Proficiency in system design & architecture. \n• Fluency in Java or other object-oriented programming languages. \n\nWhat's going on under the hood of Experian?\n\nFor Experian on the CSid side: \n\n\n* We are migrating from Rackspace to AWS: \n\n\n\n* Big visibility, big risk, big complexity, moving from manual to automated code. (WOW)\n\n\n\n* As we migrate, we are looking at Cost Optimization in AWS\n\n\n\n* Optimize DBs/EC2s/Storage choices\n\n\n\n* We are looking longer term to go MULTI-CLOUD (Next Gen Cloud)\n\n\n\n* Many complex changes again including breaking our Experian/EPS/CSid stacks down into MicroServices\n\n\n\n\n\n\n Tech Stacks for EPS/CSid\n\n\n* PHP + Java + RESTful services\n\n* Python/JSON for AWS automation\n\n* Jenkins, Cloudbees (CI/CD components)\n\n* AppDynamics + Splunk + Nagios + DataDog = monitoring solutions\n\n\n\n\n We are also working on transforming teams into a Site Reliability model. \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Cloud, Engineer, DevOps, Amazon, Java and Python jobs that are similar:\n\n
$70,000 — $120,000/year\n
This job post is closed and the position is probably filled. Please do not apply. Work for Time Doctor and want to re-open this job? Use the edit link in the email when you posted the job!
\nThis is a full time remote position. All of our current Development Team is in Asia or Europe, which is why we prefer to hire people from Europe and Asia, for ease of team collaboration.\n\nYou will be working mostly on flexible hours although you will need to attend the team meeting which is at approximately 8:30am GMT and work for at least 3 hours after this time. This is 3:30am to 6:30am New York time, so you can see that it is unlikely, although not impossible, that we will hire in North or South America.\n\nYou will be responsible for the deployment and maintenance of a cloud-based multi tenant SaaS solution. To qualify for this job, you must have a minimum of 2 years experience in developing, deploying and maintaining large Amazon AWS based SaaS solutions.\n\nThe role will encompass the use of a broad range of AWS technologies, operating systems (Windows, Linux) and application environments (Nginx, Apache, MySQL, MongoDB, Redis, queue management, Memcache and other open source technologies), understanding of TCP/IP networking with an emphasis on the implementation of best practice cloud security principles.\n\nTop 5 skills needed for this job:\n\n* At least 3 years of Experience with AWS services\n\n* At least 2 years of programming with PHP or Python\n\n* Very good understanding of web app and server security\n\n* Solid experience in building highly scalable server architectures\n\n* Solid experience as a DevOps Engineer in a 24x7 uptime Amazon AWS environment, including automation experience with configuration management tools.\n\n\n\nWHAT YOU WILL BE RESPONSIBLE FOR:\n\n* Deploying, automating, maintaining and managing AWS cloud based production systems, to ensure the availability, performance, scalability and security of production systems.\n\n* Building, releasing and handling configuration management of production systems.\n\n* Doing pre-production Acceptance Testing to help assure the quality of our products / services.\n\n* System troubleshooting and 
problem solving across platform and application domains.\n\n* Suggesting architecture improvements, recommending process improvements.\n\n* Evaluating new technology options and vendor products.\n\n* Ensuring critical system security through the use of best in class cloud security solutions.\n\n* Supporting installation and maintenance of layered software, and infrastructure.\n\n* Identifying where applications or hardware is having performance/reliability issues; analyzes and formulates a proposed method to correct issues.\n\n* Delivering long-term support and management; troubleshoots and resolves issues daily\n\n* Working in accordance with corporate and organizational security policies and procedures.\n\n* Understanding personal role in safeguarding corporate and client assets.\n\n* Taking appropriate action to prevent and report any compromises of security within scope of role.\n\n* Providing Incident management\n\n* Working on and maintain continuous integration systems\n\n* Working on deploying web applications on various environments\n\n* Debugging and analyzing production load\n\n* Executing penetration tests on production or pre-production environments\n\n* Executing load testing on pre-production or production environments\n\n* Maintaining a system for running automated tests, optimizing and speeding up the execution of test sets\n\n\n\nTO BE THE BEST FIT FOR THIS JOB YOU NEED TO:\n\n* Have at least 2 years of development in PHP, Java or Python and at least 5 years of experience in server management\n\n* Have solid experience :\n\n\n\n* as a DevOps Engineer in a 24x7 uptime Amazon AWS environment, including automation experience with configuration management tools.\n\n* with Ubuntu / CentOS\n\n* in building highly scalable server architectures\n\n* in continuous integration\n\n\n\n* Be an expert in :\n\n\n\n* working with AWS - EC2, RDS, S3\n\n* server security\n\n* setting up and tuning LEMP\n\n* tuning NGINX performance\n\n* setting up back and 
stability systems\n\n\n\n* Have good experience with :\n\n\n\n* MySQL Replication / Sharding\n\n* setting up and securing WordPress\n\n* managing and installing SSL certificates, configuring firewalls and VPN\n\n* Nagios, New Relic or any other monitoring software\n\n* Vagrant and writing provisioning scripts in Chef or Puppet\n\n\n\n* Have the ability to :\n\n\n\n* write bash scripts\n\n* set up automated deployment of different projects on different environments\n\n* understand complex software architecture\n\n* set up multi-tier architectures\n\n\n\n* Experience with LXD and in setting up MongoDB replication are a plus\n\n\n\nDESIRABLES:\n\n* Bachelor of Computer Sciences or Software Engineering or Informatics\n\n* Great verbal and written communication skills\n\n* Be available 24/7 in case of need or emergency\n\n* Have a really stable internet connection and an alternative source of connection\n\n* Ability to travel around the world for meetings\n\n\n\nTO APPLY, please go to this link- http://time-doctor.breezy.hr/p/6c7a97392bd9-devops-engineer-100--remote-work \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to DevOps, Engineer, Amazon, Java, Cloud, PHP, Python, Travel, SaaS and Nginx jobs that are similar:\n\n
$70,000 — $120,000/year\n