Site Reliability Engineer [SRE] job at FLUIX California, MO, US

Job Description

FLUIX is building the AI operating system that plans, designs, and optimizes AI infrastructure. We are based in Silicon Valley. We specialize in providing AI-driven solutions for data centers and power providers, leveraging cutting-edge Machine Learning (ML) and Artificial Intelligence (AI) technologies. Our mission is to double America’s compute capacity without building new data centers.

We are seeking a skilled Site Reliability Engineer to join our growing team. The ideal candidate will help ensure the reliability, scalability, and performance of our hybrid (cloud and on-prem) platform while supporting our AI/ML infrastructure. You will work closely with our engineering, AI, and operations teams to build and maintain robust systems that support our solutions. Your expertise in ML/AI and experience with data center sites will be crucial in driving the success of our platform.

Who you’ll work closely with

Founder & CEO Chase Overcash CTO

What you’ll do

Design, implement, and maintain scalable systems while optimizing performance, ensuring high availability and disaster recovery, and assisting with codebase refactoring for modular deployment.

Develop and maintain automation tools to streamline operations, improve efficiency, and automate repetitive tasks to enhance system reliability.

Collaborate with engineering and data science teams to integrate ML and AI models into production environments, ensuring seamless integration and high performance of those models within our technology stack.

Identify areas for improvement and drive initiatives to enhance system reliability and performance, while staying up to date on industry trends and advancements in SRE practices, ML, and AI technologies.

Respond to and resolve incidents quickly to minimize their impact, conduct post-incident reviews, and implement improvements to prevent recurrence.

Create and manage multiple cloud instances (dev, staging, test), optimize cloud infrastructure and data center operations, and ensure the security and compliance of both infrastructure and applications.

Your background

Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).

Proven experience as a Site Reliability Engineer or in a similar role in a SaaS environment, with a strong background in managing and optimizing cloud infrastructure (AWS preferred, or GCP, Azure), experience with ML and AI technologies, and familiarity with data center operations integrations.

Proficiency in programming and scripting languages (e.g., Python), experience with containerization and orchestration tools (Kubernetes), a strong understanding of networking, security, and performance optimization, and knowledge of CI/CD pipelines and DevOps practices.

Excellent problem-solving skills with attention to detail, strong communication and collaboration abilities, and the capacity to thrive in a fast-paced, dynamic startup environment.

Culture Fit

We are looking for obsessed individuals who want to give it their all. We are not afraid to get our hands dirty with physical and software systems. We are eager to visit and work with clients and understand the importance and gravitas of their mission-critical work. We are eager to come into the office and on-site, as our work directly affects physical environments. Due to our mission-critical work, we understand and are eager to help our teammates and co-workers during holidays, weekends, and emergencies. We are cordial and over-communicate with teammates, co-workers, and management.

Attractive compensation package, including equity options.

Comprehensive health, dental, and vision insurance, along with other standard benefits.

A dynamic and collaborative San Francisco Bay Area work environment.

Opportunities for professional growth and development, with the chance to shape the future of technology in the industry.

Job Tags

Work at office, Weekend work
