About the Role
We are looking for a versatile Data Scientist to lead our AI initiatives, bridging the gap between classical Machine Learning and cutting-edge Generative AI. You will contribute to the development of StarkVision, our autonomous AI agent platform, and maintain our core predictive models for churn, segmentation, forecasting, and more. This is a high-impact role in which you will not only build models but also architect complex multi-agent systems that interact directly with our users and databases.
Key Responsibilities
1. Generative AI & Autonomous Orchestration
Agent Orchestration: Design and optimize multi-agent workflows. You will manage specialized agents (Database Admin, Report Agent, Cloud Specialist) to handle complex user requests.
Text to SQL: Enhance our Text-to-SQL capabilities, ensuring accurate translation of natural language into complex SQL queries using LLMs (GPT-4, Gemini, Grok).
RAG & Memory: Maintain and improve our long-term memory systems using Vector Databases (ChromaDB) and RAG pipelines to provide context-aware responses.
Tool Integration: Develop and maintain MCP (Model Context Protocol) tools that allow agents to interact with external services (OneDrive, SQL Databases, PDF parsers).
2. Machine Learning & Predictive Modeling
Churn Prediction: Maintain and improve our deep learning Churn Model built with TensorFlow/Keras (LSTMs). You will handle 3D time-series data construction and model optimization.
Customer Segmentation: Refine our clustering pipelines using scikit-learn and Faiss (for GPU-accelerated clustering). Implement advanced feature selection techniques and hybrid clustering approaches (K-Means + Hierarchical).
Forecasting: Manage time-series forecasting modules for sales and demand prediction.
3. Backend & Deployment
Production Engineering: Deploy models and agents within our Flask application, utilizing Redis and RQ (Redis Queue) for asynchronous background processing.
Data Engineering: Write efficient SQL queries and Python scripts (Pandas, SQLAlchemy) to preprocess large datasets for both ML training and agent consumption.
4. Other
Company Workflows: Assist in requirements definition, ensuring alignment across teams and stakeholders.
Requirements
• Bachelor's degree in Computer Science, Software Engineering, Computer Engineering, or a related technical field.
• 3+ years of experience in Data Science or Machine Learning Engineering.
• GenAI Expertise: Proven experience building LLM-based applications. Familiarity with Agentic workflows and RAG architectures is a must.
• Deep Learning: Hands-on experience with Neural Networks, specifically LSTMs/RNNs for time-series data (TensorFlow or PyTorch).
• Strong Math/Stats Foundation: Deep understanding of clustering algorithms, dimensionality reduction (PCA), and statistical forecasting.
• Coding Skills: Expert-level Python. Comfortable writing production-ready code, unit tests, and working with APIs.
• Database Skills: Strong SQL proficiency. You understand database schemas and can optimize queries.
Nice to Have
• Experience with Kubernetes and Helm Charts.
• Knowledge of Cloud Services.
• Experience with Text-to-SQL fine-tuning or prompt engineering.
• Familiarity with Faiss for large-scale similarity search.
• Knowledge of the Model Context Protocol (MCP).
• Knowledge of the Agent2Agent protocol (A2A).
• Ability to handle ambiguity in complex requirements scenarios.
• Experience with on-prem solutions.
What We Offer
• A great environment with real-world challenges.
• Opportunity to work on high-tech products integrating Analytics, AI and BI for Banking, Retail and Health.
• A modern technical environment (Modern Python, automated workflows).
• Hybrid work format.