Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.
Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the mission critical stack, with focus on security, resiliency, scale, and performance. Authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate clear understanding of automation and orchestration principles. Act as ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Professional curiosity and a desire to develop a deep understanding of services and technologies.
A BS or MS in Computer Science, or equivalent. Identifies and implements complex solutions using knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Experience running large-scale customer-facing web services. Identifies and implements complex solutions using an understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 8 years of experience running large-scale customer-facing web services.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans status or any other characteristic protected by law.
Oracle Hospitality Cloud Service Reliability Engineering
LOCATION: COLUMBIA, MD PREFERRED BUT REMOTE WILL BE CONSIDERED
NOTE: We are unable to provide visa sponsorship for this role at this time - no candidates requiring visa sponsorship will be considered.
The Hospitality Cloud DevOps team is a newly formed group within Oracle Hospitality focused on maximizing our service reliability and availability while working in parallel to evolve applications from SaaS to True Cloud Native Solutions. A start-up-like approach with a big charter and room for creative freedom. We are looking to assemble some of the smartest people in the industry to build and grow this revolutionary and disruptive team.
About The Job
A unique opportunity to join this newly formed team to engineer cutting-edge Oracle Cloud technologies and infrastructure that make up the Oracle Hospitality Cloud solutions. As part of the SRE team, you will be continually challenged and have an opportunity to contribute to the Oracle Cloud success every day, working closely with the development and infrastructure partners.
As an SRE, you will solve interesting technical challenges by defining, designing, deploying and troubleshooting key Oracle Cloud services, platforms, and infrastructure, always thinking about reliability, scalability, resilience, security, and performance.
In this role, you will be responsible for the following:
Service Ownership - You will be part of the SRE team, whose mission is the shared full stack ownership of a collection of services and/or technology areas, with our Development partners.
Ownership Scope - As an SRE, you will understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of the production services you own. In partnership with your Development partners, you will have the responsibility to ensure that services are designed, delivered and deployed to be mission critical with focus on security, resiliency, scale, and performance. SREs are accountable for the end-to-end performance and operability of the services they own.
Service Design - As the Oracle Cloud evolves, you will partner with development teams in defining and implementing improvements in service architecture, both current and future. As an SRE, you will be an expert at articulating technical characteristics of your services and the dependencies between services, and guide Development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio.
Operations Engineering - You will understand and be able to communicate the scale, capacity, security, performance attributes and requirements of the services you own. You will understand and communicate every characteristic of your service stack, such as:
degradation and behavior under load of the services and their dependencies
end-to-end tuning needs, optimizing resource utilization as load patterns fluctuate
instrumentation and metrics that clearly describe the service behaviors
scaling requirements and patterns
resiliency and recoverability, ensuring that backup / restore and disaster recovery capabilities are implemented, tested and maintained
Automation - You will have a clear understanding of automation and orchestration principles, and will be eager to automate, wherever and whenever the possibility arises, while simultaneously eliminating technical debt. Automation must be part of your DNA.
Broad Interests - SREs are a rare mix of sysadmins and development engineers, and as such have the ability to understand and explain the effect of product architecture decisions on the ability to run as distributed systems. They are driven by professional curiosity and a desire to develop deep understanding of their services and the technologies they depend upon.
Passion - You do not live in hope that systems are operational and available; you know, because of your work, that your systems are resilient, reliable, secure and serving our customers.
Ideal Qualifications / Experience
BS or MS in Computer Science, or equivalent
5 years of experience running large-scale customer-facing web services / service development / DevOps / SRE
Server hardware configuration / Linux internals and system administration is a must-have
Oracle Fusion Middleware products experience mandatory: Oracle WebLogic Server, Oracle Access Manager (OAM), Oracle Identity Manager (OIM) and Oracle Internet Directory (OID)
Advanced Oracle Enterprise Manager (OEM) experience
Experience working with Oracle Cloud Infrastructure (OCI)
Experience with automation/configuration management using either Puppet/Chef or an equivalent, particularly with Oracle Fusion Middleware applications and associated operations
System and application performance measurement and evaluation experience
System / application architecture experience in designing and implementing High Availability for SaaS applications
Methodical approach to troubleshooting complex problems
Development / debugging in languages such as C, C++, Java, Python, Go, Perl or Ruby
Oracle Database and big data stores
Container technologies, such as Docker and Kubernetes
Defining and documenting technical architecture of complex and highly scalable products
Most importantly, the aptitude to be a good team player and the willingness to learn and implement new Cloud technologies
Detailed Description and Job Requirements
Design, develop, troubleshoot and debug software programs for databases, applications, tools, networks etc.
As a member of the SRE division, you will take an active role in the definition and evolution of standard practices and procedures. You will be responsible for defining and developing software for tasks associated with the developing, designing and debugging of software applications or operating systems.
Work is non-routine and very complex, involving the application of advanced technical/business skills in area of specialization. Leading contributor individually and as a team member, providing direction and mentoring to others. BS or MS degree or equivalent experience relevant to functional area. 5 years of software engineering or related experience.
Title: Site Reliability Developer - REMOTE