Cloud Operations Engineer - East Windsor, NJ
We have an opening for an experienced, innovative, strategic Cloud Operations Engineer, preferably based in our East Windsor, NJ office, though we are open to Boston, MA and New York City as well. Does this sound like you?
The role of the Cloud Operations Engineer (COE) is to manage and maintain the delivery of steady-state Cloud IT operational support services for our application environment(s). The COE will help ensure that MHE’s IT environment is secure, stable, and up to date on patching and remediation of security vulnerabilities. They will manage and maintain all Cloud management tools used to support our Cloud environment. In addition, they will be the conduit between the embedded SRE application teams, Operations, and our vendors, ensuring all operational actions are executed in compliance with policies and standards.
The Cloud Operations Engineer will work alongside MHE TSDMs and vendor resources during escalated incidents, helping to troubleshoot infrastructure- and application-related outages. They will work closely with the Cloud Automation and Cloud Architecture teams to develop and implement automation of operational tasks that helps secure and stabilize the environment while enabling the business to deploy applications for our customers more easily and quickly.
This role requires in-depth technical support knowledge across a broad range of IT Cloud infrastructure components and tools, as well as expertise in Cloud best practices. The role may also include planning and scheduling activities.
Your contribution to the team includes:
- Tool management and administration (Puppet, Artifactory, K8s, ECS, etc.).
- Work alongside MHE TSDMs and vendor resources during escalated incidents, help to troubleshoot incidents, analyze root cause and provide solutions. Help support and maintain all environments (Prod, QA, QA Live, Dev, Demo) and liaise with third-party providers.
- Work with the Cloud Automation team and Architecture team to develop and implement operational automation for all manual processes (e.g., application onboarding).
- Manage and direct the support vendor on all operational activities, ensuring the highest quality of delivery and maintaining a secure and stable environment for the business.
- Work side by side with the SRE leads for each application and assure all of their needs are met so they are able to expeditiously deliver the applications they are developing into production in an automated fashion while maintaining the required security and stability of the environment.
- Provide guidance and expertise in relation to System Administration activity. Coordinate and delegate tasks to third-party vendors as required.
- Manage environments hosted by vendors such as Amazon Web Services.
- Help support and maintain all data (production and non-production environments).
- Conduct regular and routine monitoring and management of infrastructure components, ensuring stable, secure operation and efficient performance.
- Develop and implement policies for Cloud operations support.
- Monitor system capacity to determine its effect on performance.
- Inspect and maintain the Cloud CMDB, Service Maps, and discovery of services/instances.
What you'll need to be successful:
- Bachelor's degree in business or information systems preferred.
- 5+ years’ experience managing cloud environments, AWS preferred (EC2 instances, ELBs, Route 53, CloudWatch, etc.).
- 5+ years’ experience scripting with Python and Bash.
- 5+ years’ experience with some of the following Cloud automation tools:
  - Kubernetes & supporting technologies (e.g. etcd, flanneld)
  - Puppet / Foreman / Ansible
  - GitHub / Turbot / Stash
  - Jenkins / CircleCI / Artifactory
  - Vault / Consul.io / Terraform
- 5+ years’ experience with the following Cloud specific functions:
  - Account provisioning
  - VPC setup & management
  - AMI management and distribution
  - Naming / Tagging standards
  - Key management and secret standards
  - Role Based Access Control implementation and management
- Security experience, understanding PCI and compliance
- Experience managing vendors (e.g., Datapipe, Rackspace, Wipro) that support hosted infrastructure such as Amazon Web Services preferred.
- Experience with ITSM tools, preferably ServiceNow; ITIL certification a plus.
- Experience creating quality documentation for processes, procedures, policies, standards and best practices.
- Experience in daily Cloud (AWS) operational tasks, including configuration and communication performance, in a secure, reliable and highly available environment.
- Monitoring and alerting exposure – Datadog, New Relic, Dynatrace, Gomez, CloudWatch
- Strong communication skills – both written and verbal.
- Self-motivated, strong team player, and independently minded.
- Additional desired skills:
  - Experience integrating software and services using common API techniques (SOAP, REST, XML)
  - Complete oversight of the Cloud infrastructure (AWS) environments.
  - Knowledge of leading technologies, market trends and technical solutions.
  - Ability to manage third-party vendors and adapt to other people’s views.
Why work for McGraw-Hill Education? We make your life easier and better, and you, smarter.