**Job Title:** Senior AI Systems / ML Infrastructure Engineer
**Location:** Onsite @ EMILO VENTURES PRIVATE LIMITED, Raipur, Chhattisgarh
**Experience:** 1+ years
**About the Role**
We are building a **scalable, multi-modal AI system** that handles text, image, and audio workloads using a combination of CPU and GPU services.
We are looking for a **Senior AI Systems Engineer** who can design, optimize, and scale AI pipelines — focusing on **performance, cost efficiency, and reliability**.
This role sits at the intersection of:
- AI/ML
- Backend systems
- Distributed architecture
- Infrastructure & performance optimization
**Key Responsibilities**
- Design and build **scalable AI pipelines** for text, image, and audio processing
- Optimize **GPU and CPU utilization** for cost and performance
- Implement **batching, queuing, and concurrency control** for high throughput
- Architect **CPU vs GPU service separation**
- Integrate and manage models such as:
- LLMs (Qwen / similar)
- Whisper (audio)
- Embedding models
- Moderation models (toxicity, sentiment, etc.)
- Build and manage **event-driven systems (NATS/Kafka)**
- Optimize model loading strategies (lazy loading, caching, quantization)
- Handle **OOM issues, latency bottlenecks, and scaling challenges**
- Design **fallback systems** (CPU fallback when GPU unavailable)
- Collaborate with product teams to balance **quality vs cost**
**Required Skills**
**AI / ML (Practical)**
- Experience with Transformers (Hugging Face ecosystem)
- Understanding of embeddings, NLP, and basic CV/audio models
- Experience deploying models in production (not just training)
**Backend & Systems**
- Strong Python (asyncio, multithreading, multiprocessing)
- Experience with FastAPI / backend frameworks
- Experience with **message queues (NATS / Kafka / RabbitMQ)**
**GPU & Performance Optimization**
- Experience working with GPU workloads (basic CUDA usage)
- Understanding of:
- VRAM management
- Batching
- Quantization (4-bit / 8-bit)
- Model loading/unloading strategies
- Ability to debug **OOM and latency issues**
**Infrastructure & DevOps**
- Docker (required)
- Cloud platforms (AWS / GCP / Azure)
- Experience with GPU instances (T4, A10, etc.)
- Autoscaling and cost optimization
**System Design**
- Microservices architecture
- CPU vs GPU workload separation
- High-throughput system design
- Fault-tolerant distributed systems
**Good to Have**
- Experience with **vLLM / TGI**
- Experience with **Prometheus / Grafana (monitoring)**
- Knowledge of **Kubernetes**
- Experience handling **real-time AI systems**
**What We Care About**
- Ability to **optimize systems, not just write code**
- Strong understanding of **trade-offs (cost vs latency vs quality)**
- Real-world experience with **production AI systems**
- Ownership mindset and problem-solving ability
**Why Join Us?**
- Work on **real-world AI systems at scale**
- Solve **challenging performance and cost problems**
- Build systems involving **LLMs, audio, and multi-modal AI**
- High ownership and impact
Job Types: Full-time, Permanent
Pay: ₹300,000.00 - ₹900,000.00 per year
Work Location: In person