Who You Are\n\nWe are looking for a driven Software Engineer (MLOps) to be a part of our growing AI Platform team. You are bold and creative, and have deep empathy for customers who may not be tech-savvy. You will design and implement significant parts of the code base and will have the opportunity to make an immediate impact with your work and guide the product and team as we grow.\n\nYou are curious and like to understand technologies and their tradeoffs in depth - providing technical guidance to the team and peers as and when required. Leading by example, you have accumulated a wealth of insights and experiences from your hands-on involvement in the field, and you are committed to rolling up your sleeves and getting work done. You like joining and supporting other engineers in their work to learn from them as well as letting them benefit from your expertise and experience.\n\nYou have the motivation and skills to identify technical product needs, initiate projects and owning their delivery, including the involvement of engineering peers as needed. You are comfortable with challenging the status quo respectfully to drive and deliver technical excellence in the team.\n\n**We are seeking a team member located within one of the following areas: USA/Canada/UK\nResponsibilities\n\nThe AI Platform team you are joining is responsible for building the core platform that powers model training, inference and decision making in our products. Furthermore the team owns MLOps and the services hosting our AI capabilities. Productionizing results from Research, as well as extending our systems and providing support according to our customer needs fall into team responsibilities as well. You will join this team as an experienced engineer with a focus on MLOps solutions to grow our expertise in that area, but also contribute as a software engineer more widely in the team.\n\nAs an organization, we strongly believe in expertise across the stack. As such, you will experience flavors of Machine Learning, Software Engineering, Distributed Systems, MLOps and DevOps.\n\nIn particular, you will:\n\n\n* Design, build and lead the MLOps initiatives and vision for the AI Platform to strengthen automation, orchestration, versioning, observability, monitoring and collaboration for the platform.\n\n* Build and design scalable components for the AI Platform to allow high throughput training and inference for RL agents doing realtime inference for autonomous control of industrial systems.\n\n* Contribute to the design and implementation of the product backend by writing REST & gRPC API services and scalable event-driven backend applications.\n\n* Design clear, extensible software interfaces for the team's customers and maintain a high release quality bar.\n\n* Design and optimize data storage & retrieval mechanisms for high throughput, security & ease of access.\n\n* Perform DevOps duties of CI/CD, Release & Deployment management.\n\n* Be a part of our global production oncall team and, own & operate your services in production, meeting Phaidraโs high bar for operational excellence.\n\n* Lead cross-functional initiatives collaborating with engineers, product managers and TPM across teams.\n\n* Mentor your peers and be a technical role-model in the team.\n\n\n\nOnboarding\n\nIn your first 30 daysโฆ\n\n\n* You will be immersed in an onboarding program that introduces you to Phaidra and our product.\n\n* You will spend time in the Engineering org, learning how the teams operate, interact, and approach problems.\n\n* You will read various parts of our handbook and familiarize yourself with the documentation culture at Phaidra.\n\n* You will set up your development environment and start working on an onboarding exercise that will introduce you to various parts of our code base.\n\n* You will learn about how we use agile and be able to navigate our sprint boards and backlogs.\n\n* You will learn about various team standards and development & release processes.\n\n* You will start to learn about our system architecture and infrastructure.\n\n* You will start picking up few good โfirst-tasksโ to get yourself accustomed to the end to end release flow.\n\n\n\n\nIn your first 60 daysโฆ\n\n\n* You will get a solid understanding of what Phaidra does and how we do it.\n\n* You will meet with team members across Phaidra and started building relationships that will help you be successful at your job.\n\n* You will complete the onboarding exercise and will be on your way to completing your first production task.\n\n* You will take ownership for the MLOps work on the team, identify gaps and propose roadmap items on the topic.\n\n\n\n\nIn your first 90 daysโฆ\n\n\n* You will be fully integrated in the team and with team members across the company.\n\n* You will have a more in-depth understanding of our system architecture and infrastructure.\n\n* You will complete your first on-call experience helping monitor and improve our production environments.\n\n* You will become an expert with our tooling.\n\n* You will start to contribute to knowledge sharing throughout Phaidra and the team.\n\n* You will take proactively drive MLOps topics in the team and represent it technically throughout the company.\n\n\n\nKey Qualifications\n\n\n* 7+ years of work experience.\n\n* Bachelors or Masters in Computer Science, or equivalent experience.\n\n* Strong experience on designing and implementing MLOps solutions for AI production systems\n\n* Expertise with production Software Engineering - relational and non-relational data modelling, micro-services, understanding of event driven systems, etc.\n\n* Strong experience building large scale multi-tenant systems with high availability, fault tolerance, performance tuning, monitoring, and statistics/metrics collection.\n\n* Strong expertise in Python and Cloud environments\n\n* Good grasp of Machine Learning (especially Deep Learning) fundamentals.\n\n* Ability to collaborate and communicate effectively in an all-remote setting\n\n* Doing your work with curiosity, ownership, transparency & directness, outcome orientation, and customer empathy.\n\n\n\nBonus\n\n\n* Experience as a service owner of a realtime production system - operating & monitoring services in production, including using observability tooling such as Prometheus, Grafana, Tempo or equivalent offerings and incident management.\n\n* Experience with building applications that can be deployed in cloud, hybrid or on prem environments\n\n* Exposure to Reinforcement Learning\n\n\n\nOur Stack\n\n\n* Languages - (Backend) Python, Go; (Frontend) JavaScript/TypeScript, React; Customer SDK & Clients - C# .NET\n\n* PyTorch\n\n* Cypress\n\n* Docker, Kubernetes, Terraform & Kapitan\n\n* Gitlab CI, ArgoCD, Atlantis, Vercel\n\n* GCP - GKE, PubSub, CloudSQL, BigTable, Postgres, etc.\n\n* Ray.io\n\n* REST & gRPC micro-services\n\n* Poetry, Pantsbuild\n\n\n\nGeneral Interview Process\n\nAll of our interviews are held via Google Meet, and an active camera connection is required.\n\n* Initial screening interview with a People Operations team member (30 minutes): The purpose of this interview is to meet you, learn more about your background, and discuss what you are looking for in a new position.\n\n* Hiring manager interview (30 minutes): The purpose of this meeting is for you to get to know the manager for the role. This chat will mainly focus on your previous experience and technical background. You can expect to talk about projects that you have worked on in the past and ask any questions about the team & role.\n\n* Technical Interview 1 (60 minutes): The purpose of this interview is to assess your skills in Machine Learning and related mathematics.\n\n* Technical Interview 2 (90 minutes): In this interview, we will go over a real world MLOps problem. You can expect to draw architecture diagrams using boxes & arrows in your browser. We will talk about system design, scalability and monitoring.\n\n* Meeting with VP of Engineering (30 minutes): This interview is a combination of technical and cultural fit assessment. You will cover the technical experience and the skills that you brinand have an opportunity to ask any questions about the teamโs culture or vision.\n\n* Culture fit interview with Phaidraโs co-founders (30 minutes): This interview focuses on alignment with Phaidraโs values\n\n\nBase Salary\n\nUS Residents: $115,200-$208,800/year\n\nUK Residents: ยฃ96,400-ยฃ144,000/year\n\nThis position will also include equity.\n\nThese are best faith estimates of the base salary range for this position. Multiple factors such as experience, education, level, and location are taken into account when determining compensation. \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to Design, Python, DevOps, Cloud, API, Senior, Engineer and Backend jobs that are similar:\n\n
$65,000 — $110,000/year\n
\n\n#Benefits\n
๐ฐ 401(k)\n\n๐ Distributed team\n\nโฐ Async\n\n๐ค Vision insurance\n\n๐ฆท Dental insurance\n\n๐ Medical insurance\n\n๐ Unlimited vacation\n\n๐ Paid time off\n\n๐ 4 day workweek\n\n๐ฐ 401k matching\n\n๐ Company retreats\n\n๐ฌ Coworking budget\n\n๐ Learning budget\n\n๐ช Free gym membership\n\n๐ง Mental wellness budget\n\n๐ฅ Home office budget\n\n๐ฅง Pay in crypto\n\n๐ฅธ Pseudonymous\n\n๐ฐ Profit sharing\n\n๐ฐ Equity compensation\n\nโฌ๏ธ No whiteboard interview\n\n๐ No monitoring system\n\n๐ซ No politics at work\n\n๐ We hire old (and young)\n\n
\n\n#Location\nSeattle, Washington, United States
๐ Please reference you found the job on Remote OK, this helps us get more companies to post here, thanks!
When applying for jobs, you should NEVER have to pay to apply. You should also NEVER have to pay to buy equipment which they then pay you back for later. Also never pay for trainings you have to do. Those are scams! NEVER PAY FOR ANYTHING! Posts that link to pages with "how to work online" are also scams. Don't use them or pay for them. Also always verify you're actually talking to the company in the job post and not an imposter. A good idea is to check the domain name for the site/email and see if it's the actual company's main domain name. Scams in remote work are rampant, be careful! Read more to avoid scams. When clicking on the button to apply above, you will leave Remote OK and go to the job application page for that company outside this site. Remote OK accepts no liability or responsibility as a consequence of any reliance upon information on there (external sites) or here.
This job post is closed and the position is probably filled. Please do not apply. Work for Security Scorecard - We are revolutionizing the cybersecurity industry and want to re-open this job? Use the edit link in the email when you posted the job!
\nOpportunity\n\nSecurityScorecard is hiring a DevOps Engineer to bridge the gap between our global development and operational teams who is motivated to help continue automating and scaling our infrastructure. The DevOps Engineer will be responsible for setting up and managing the operation of project development and test environments as well as the software configuration management processes for the entire application development lifecycle. Your role would be to ensure the optimal availability, latency, scalability, and performance of our product platforms. You would also be responsible for automating production operations, promptly notifying backend engineers of platform issues, and checking long term quality metrics.\n\nOur infrastructure is based on AWS with a mix of managed services like RDS, ElastiCache, and SQS, as well as hundreds of EC2 instances managed with Ansible and Terraform. We are actively using three AWS regions, and have equipment in several data centers across the world.\n\nRegions: North America (GMT-7.00) Mountain time - (GMT-4.00) Atlantic time\n\nResponsibilities\n\n\n* Training, mentoring, and lending expertise to coworkers with regards to operational and security best practises. \n\n* Reviewing and providing feedback on GitHub Pull Requests to team members AND development teams- a significant percentage of our Software Engineers have written Terraform.\n\n* Identifying opportunities for technical and process improvement and owning the implementation. \n\n* Championing the concepts of immutable containers, Infrastructure as Code, stateless applications, and software observability throughout the organization.\n\n* Systems performance tuning with a focus on high availability and scalability.\n\n* Building tools to ease the usability and automation of processes\n\n* Keeping products up and operating at full capacity\n\n* Assisting with migration processes as well as backup and replication mechanisms\n\n* Working on a large-scale distributed environment where you were focused on scalability/reliability/performance\n\n* Ensuring proper monitoring / alerting are configured\n\n* Investigating incidents and performance lapses\n\n\n\n\nCome help us with projects such as…\n\n\n* Extending our compute clusters to support low latency, on-demand job execution\n\n* Turning pets into cattle\n\n* Cross region replication of systems and corresponding data to support low latency access\n\n* Rolling out application performance monitoring to existing services, extending integrations where required\n\n* Migration from self hosted ELK to a SaaS stack\n\n* Continuous improvement of CI/CD processes making builds & deployments faster, safer, and more consistent\n\n* Extending a Global VPN WAN to a datacenter with IPSec+BGP\n\n\n\n\nRequirements\n\n\n* 3+ years of DevOps and/or Operations experience in a Linux based environment\n\n* 1+ years of production environment experience with Amazon Web Services (AWS)\n\n* 1+ years using SQL databases (MySQL, Oracle, Postgres)\n\n* Strong scripting abilities (bash/python)\n\n* Strong Experience with CI/CD processes (Jenkins, Ansible) and automated configuration tools (Puppet/Chef/Ansible)\n\n* Experience with container orchestration (AWS ECS, Kubernetes, Marathon/Mesos)\n\n* Ability to work as part of a highly collaborative team\n\n* Understanding of monitoring tools like DataDog\n\n* Strong written and verbal communication skills\n\n\n\n\nNice to Have\n\n\n* You knew exactly what was meant by "Turning pets into cattle"\n\n* Experience working with Kubernetes on bare-metal and/or the AWS Elastic Kubernetes Service.\n\n* Experience with RabbitMQ, MongoDB, or Apache Kafka.\n\n* Experience with Presto or Apache Spark.\n\n* Familiarity with computation orchestration tools such as HTCondor, Apache Airflow, or Argo.\n\n* Understanding of network concepts- OSI layers, firewalls, DNS, split horizon DNS, VPN, routing, BGP, etc.\n\n* A deep understanding of AWS IAM, and how it interacts with S3 buckets.\n\n* Experience with SAFe.\n\n* Strong programming skills in 2+ languages.\n\n\n\n\nTooling We Use\n\n\n* Data definition, format and interfaces\n\n\n\n* Definitions - Protobuf V3\n\n* Normalize from - JSON / XML / CSV\n\n* Normalize to - Protobuf / ORC\n\n* Interfaces - REST API(s) and object store buckets\n\n\n\n* Cloud Services - Amazon Web Services\n\n* Databases: Postgresql, PrestoDB\n\n* Cache: Redis, Varnish\n\n* Languages: Python / C++14 / Scala / Golang / Javascript / Ruby / Java\n\n* Job Orchestration - HTCondor / Apache Airflow / Rundeck\n\n* Analytics - Spark \n\n* Storage: NFS/EFS, AWS S3, HDFS\n\n* Computation - Docker Containers / VMs / Metal / EMR\n\n\n \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to DevOps, InfoSec, Senior, Engineer, JavaScript, Amazon, Python, Scala, Ruby, SaaS, Golang, Apache, Linux and Backend jobs that are similar:\n\n
$70,000 — $120,000/year\n
\n\n#Benefits\n
๐ฐ 401(k)\n\n๐ Distributed team\n\nโฐ Async\n\n๐ค Vision insurance\n\n๐ฆท Dental insurance\n\n๐ Medical insurance\n\n๐ Unlimited vacation\n\n๐ Paid time off\n\n๐ 4 day workweek\n\n๐ฐ 401k matching\n\n๐ Company retreats\n\n๐ฌ Coworking budget\n\n๐ Learning budget\n\n๐ช Free gym membership\n\n๐ง Mental wellness budget\n\n๐ฅ Home office budget\n\n๐ฅง Pay in crypto\n\n๐ฅธ Pseudonymous\n\n๐ฐ Profit sharing\n\n๐ฐ Equity compensation\n\nโฌ๏ธ No whiteboard interview\n\n๐ No monitoring system\n\n๐ซ No politics at work\n\n๐ We hire old (and young)\n\n
# How do you apply?\n\nThis job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.
This job post is closed and the position is probably filled. Please do not apply. Work for FineTune Learning and want to re-open this job? Use the edit link in the email when you posted the job!
\nPrefer someone close (-2 to +2 hrs) in East Standard Time zone\n\nFineTune is seeking a strong senior python + devops engineer who will be the expert in residence around\n\n\n* AWS (Google Cloud or any public cloud) infrastructure scaling, monitoring\n\n* python\n\n* sql, nosql and various database optimization\n\n* docker operations and microservices using Kubernetes\n\n* S3, SQS, message queues\n\n* Auto scaling\n\n\n\n\nS/he must have hands on experience working with at least 3 production released software projects/products. We expect him/her to be highly proficient python developer also to also contribute to development of scalable services. S/he must also have hands on experience with docker in production via Kubernetes like orchestration, comfortable with terraform or cloudformation and other cloud automation frameworks. \n\nS/he should be comfortable solving complex system problems while making sure to help increase velocity for software team when building new and updating existing services to scale to support 100k+ users\n\n\nS/he should also be comfortable interacting with stakeholder and provide guidance on the technical feasibility and scope of engineering/rearchitecting needed to solve problems and deliver features. \nDevops engineer will work with engineering to find best ways to increase reliable continuous integration, continuous deployment to enhance software quality and development speed.\n\nS/he will interface with CTO to continuously drive scalability, automation, innovation and new product development while promoting and advancing the scalability and modularization of current platform we are working on with Collegeboard. \n\n#Salary and compensation\n
No salary data published by company so we estimated salary based on similar jobs related to DevOps, Backend, Cloud, NoSQL, Python, Senior and Engineer jobs that are similar:\n\n
$70,000 — $120,000/year\n
\n\n#Benefits\n
๐ฐ 401(k)\n\n๐ Distributed team\n\nโฐ Async\n\n๐ค Vision insurance\n\n๐ฆท Dental insurance\n\n๐ Medical insurance\n\n๐ Unlimited vacation\n\n๐ Paid time off\n\n๐ 4 day workweek\n\n๐ฐ 401k matching\n\n๐ Company retreats\n\n๐ฌ Coworking budget\n\n๐ Learning budget\n\n๐ช Free gym membership\n\n๐ง Mental wellness budget\n\n๐ฅ Home office budget\n\n๐ฅง Pay in crypto\n\n๐ฅธ Pseudonymous\n\n๐ฐ Profit sharing\n\n๐ฐ Equity compensation\n\nโฌ๏ธ No whiteboard interview\n\n๐ No monitoring system\n\n๐ซ No politics at work\n\n๐ We hire old (and young)\n\n
# How do you apply?\n\nThis job post has been closed by the poster, which means they probably have enough applicants now. Please do not apply.