Join our innovative team as a Linux Engineer, where you will be integral to automating and optimizing our Linux server infrastructure. At IMC, the Linux Engineering team manages the provisioning, configuration, and performance of a large-scale, mission-critical Linux server fleet. In this role, you will use automation and self-service tools to keep our servers stable, reliable, and scalable, meeting the needs of a rapidly evolving industry. Your forward-thinking approach and dedication to continuous improvement will help us maintain our leadership position by integrating the latest technologies and methodologies.
Key Responsibilities:
- Utilize advanced tools and techniques to troubleshoot and resolve complex issues on enterprise Linux systems, ensuring the stability and performance of our key trading and development environments.
- Enhance and maintain configuration management code and automated processes for over 7,500 critical Linux systems in a near-24/7 high-frequency trading (HFT), ultra-low-latency environment.
- Apply your Python expertise to design, develop, and maintain processes that manage and sustain critical Linux systems at scale in a diverse and technically complex setting.
- Improve and support existing programs and processes that provision bare-metal servers, transforming them into fully operational Linux trading and development platforms.
- Support and refine our metrics and log collection infrastructure, along with our core monitoring and alerting tools, to ensure comprehensive system visibility.
- Communicate status updates, ideas, and strategies effectively with peers and stakeholders through various channels, including chats, face-to-face interactions, issue tracking systems, clear commit messages, and well-documented merge requests.
Qualifications:
- Bachelor’s degree in Computer Engineering or a related field.
- 5+ years of experience in Linux engineering, debugging, administration, and OS provisioning (PXE/DHCP/TFTP/GRUB).
- Extensive experience with large-scale configuration management, preferably using Puppet and Hiera.
- Proficiency in building, modifying, and publishing Docker images.
- Hands-on experience with Kubernetes.
- Advanced Python skills for automation, API programming, design, unit testing, and debugging.
- Proven experience designing Ansible tasks and playbooks and using Ansible Tower.
- Expertise in RPM design, build, publishing, and repository management.
- Familiarity with CI/CD pipelines, version control systems (Git), and best practices for branching and merging.
- Proficiency in a range of system and network tools and services, including eBPF, tcpdump, strace, nmcli (NetworkManager), systemd, NTP/PTP, lsof, nc, nmap, and NFS/S3 storage.
- Strong understanding of networking fundamentals, including DNS, TCP/UDP, and multicast.
- Experience with monitoring tools such as Prometheus/Grafana, Alertmanager, Alerta, and Opsgenie.