Neeladri Shekhar Pal
Lead Data Scientist · GenAI & Forecasting
About
// who I am
Over the past 6+ years, I have led production AI/ML initiatives for Insurance, Petrochemicals, and Financial Services clients across the US, Australia, and the Middle East — translating ambiguous problems into measurable impact.
My recent focus is Generative AI, Agentic AI (LangGraph), and time-series forecasting. I have shipped a conversational claims agent operating at ~99% success across 22,000+ production queries, and a multi-agent fraud detection platform that cut claim adjudication from hours to under 2 minutes. Earlier engagements delivered >95% forecasting accuracy and ~35% process optimization for global clients.
Engineer by education, data whisperer by profession — a firm believer that numbers tell captivating stories. Outside of the data universe, I'm all about finding balance.
Experience
// what I've shipped
- Conversational AI for Claims Analytics: Architected and deployed a production LangGraph agent on Cortex Lab that translates natural-language questions into executable code over 7M+ claim activity records from MIMICA in real time.
- Multi-Stage Reasoning Pipeline: Intent triage (Conversational / Simple-Data / Multi-Step), dynamic plan decomposition, code generation, sandboxed execution, correctness verification, and a self-correction loop with up to 5 retries and persistent error memory.
- Code Safety & Guardrails: 6-layer safety framework — prompt-injection defenses (instruction override, persona hijack, prompt extraction, delimiter attacks, code injection, PII exfiltration), import blocking, timeouts, output caps, schema/answer caching, S3-persisted trace logs.
- Production Deployment: Gunicorn + Flask + Plotly Dash with async polling-based streaming for real-time response delivery.
- Outcome: ~99% success across 22,000+ production queries over 150 active days (100% in the most recent week).
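The retry-with-error-memory loop described above can be sketched in plain Python. The `generate`, `execute`, and `verify` callables here are hypothetical stand-ins for the production LangGraph nodes, shown only to illustrate the control flow:

```python
# Minimal sketch of a self-correction loop with a retry budget and
# persistent error memory. The generate/execute/verify interface is
# an assumption for illustration, not the production API.

MAX_RETRIES = 5

def solve_with_self_correction(question, generate, execute, verify):
    """Try up to MAX_RETRIES times, feeding past errors back to the generator."""
    error_memory = []  # persists across attempts so the generator avoids repeats
    for attempt in range(MAX_RETRIES):
        code = generate(question, error_memory)
        try:
            result = execute(code)          # sandboxed execution step
        except Exception as exc:            # capture runtime failures
            error_memory.append(f"attempt {attempt}: {exc}")
            continue
        if verify(question, result):        # correctness verification step
            return result
        error_memory.append(f"attempt {attempt}: verification failed")
    return None                             # retry budget exhausted
```

The key design point is that `error_memory` is threaded back into generation, so each retry is conditioned on every previous failure rather than being a blind re-roll.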
- Multi-Agent Architecture: LangGraph-orchestrated pipeline of 7 specialized agents for FNOL intake, loss-cause classification, and parallel fraud adjudication across structured, document, visual, and network-graph modalities.
- Hybrid ML + LLM Reasoning: Pre-trained XGBoost classifier wired through a provider-agnostic LLM layer supporting Azure AI Foundry and Amazon Bedrock for flexible model switching.
- Visual & Document Forensics: Image authenticity checks (reverse search, AI-generated detection) and cross-document discrepancy reasoning surface manipulated evidence and inconsistent claim narratives.
- Network Graph Intelligence: Fraud-ring detection via entity overlap (garage, broker, customer, ZIP) visualized as an interactive graph for adjusters.
- Smart Triage & PII: Auto-routing to Auto-Approve / Manual Review / SIU with audit-logged decisions; upstream PII masking via Microsoft Presidio for GDPR/HIPAA-aligned handling.
- Outcome: Reduced claim adjudication from hours to under 2 minutes with explainable, evidence-backed verdicts adjusters can defend in SIU referrals.
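The entity-overlap idea behind the network-graph agent can be illustrated with a small, self-contained sketch: claims sharing a garage, broker, customer, or ZIP are linked, and connected components of that graph become candidate fraud rings. All field names and data below are invented for illustration:

```python
# Sketch of fraud-ring detection via shared-entity overlap using
# union-find; not the production graph pipeline.
from collections import defaultdict

def find_rings(claims):
    """claims: {claim_id: {field: value}}. Returns groups of linked claim ids."""
    # Invert the data: each (field, value) entity -> claims mentioning it.
    by_entity = defaultdict(set)
    for cid, fields in claims.items():
        for field, value in fields.items():
            by_entity[(field, value)].add(cid)

    # Union-find over claims that share any entity.
    parent = {cid: cid for cid in claims}
    def root(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for linked in by_entity.values():
        ids = sorted(linked)
        for other in ids[1:]:
            parent[root(other)] = root(ids[0])

    groups = defaultdict(set)
    for cid in claims:
        groups[root(cid)].add(cid)
    # Only multi-claim components are interesting as candidate rings.
    return [g for g in groups.values() if len(g) > 1]
```

In the real system the resulting components are rendered as an interactive graph for adjusters; this sketch only shows the linkage logic.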
- Meta-Modeling Architecture: Multi-layer framework spanning 48 U.S. states and 7 lines of business, ensembling Chronos LLM, TabPFN Timeseries, TimesFM, PatchTST, NBEATS, and Ridge regression on residuals.
- Accuracy & Stability: >95% average accuracy on a 56-month validation window with a rolling 26-week horizon at weekly granularity — consistent across market and seasonal regimes.
- Outcome: Enabled state-level dynamic resource allocation and capacity planning across regions and lines of business.
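One minimal way to read the "Ridge regression on residuals" layer is as a closed-form ridge fit over stacked base-model forecasts, learning how much weight each forecaster deserves. This is a generic sketch with placeholder data, not the production meta-model:

```python
# Closed-form ridge over base-forecaster outputs: w = (X^T X + aI)^{-1} X^T y.
# Column i of X holds base model i's predictions; y is the actual series.
import numpy as np

def ridge_ensemble_weights(base_preds, y, alpha=1.0):
    """base_preds: (n_samples, n_models) matrix of base forecasts."""
    X = np.asarray(base_preds, dtype=float)
    y = np.asarray(y, dtype=float)
    k = X.shape[1]
    # Solve the regularized normal equations directly.
    return np.linalg.solve(X.T @ X + alpha * np.eye(k), X.T @ y)
```

The `alpha` penalty keeps weights stable when base forecasters are highly correlated, which is common when ensembling several time-series models over the same history.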
- Questionnaire Optimization: Analyzed 107 evaluation parameters using Cramér's V and Phi-coefficient association tests to identify redundant and interdependent variables.
- Feature Prioritization: Ran 2-way / 3-way ANOVA against claim outcomes (high SSR relative to SSE) to retain categorical questions with high discriminative power.
- Outcome: Cut the questionnaire to 60–70 state-specific, compliance-aligned items — ~35% faster policy evaluation and a ~14% shorter underwriting lifecycle.
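Cramér's V, the association measure used in the questionnaire analysis above, can be computed from a contingency table in a few lines. This is a generic textbook implementation, not the project code:

```python
# Cramér's V for a 2-D contingency table of observed counts.
# V ranges from 0 (no association) to 1 (perfect association),
# which is what flags redundant / interdependent questionnaire items.
import math

def cramers_v(table):
    """table: list of rows of observed counts. Returns V in [0, 1]."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    # Chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            chi2 += (obs - exp) ** 2 / exp
    k = min(len(table), len(table[0]))   # smaller table dimension
    return math.sqrt(chi2 / (n * (k - 1)))
```

Two questions whose answer cross-tabulation yields V near 1 are effectively asking the same thing, making one of them a candidate for removal.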
- Cannibalization Detection: Identified product cannibalization and upgrade transitions using cointegration and distributional tests across the SKU portfolio.
- Product Association: Discovered SKU interconnections and built alias mappings from historical purchase patterns via Market Basket Analysis.
- Modeling: Meta-modeling solution combining Linear Dynamical Systems and Deep Auto-Regressive Recurrent Networks, scaled to ~2,600 SKUs across ~49 countries in the MEAF region.
- Leadership: Led the data science squad, owned client comms, and drove the transition to Agile-Waterfall hybrid — improving utilization tracking, velocity estimation, and delivery quality.
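The Market Basket step above boils down to support, confidence, and lift computed over transaction baskets. A minimal pairwise sketch (item names are invented; real MBA tooling generalizes this to itemsets):

```python
# Pairwise market-basket metrics over a list of transaction baskets.
# support    = P(a and b together)
# confidence = P(b | a)
# lift       = confidence / P(b); lift > 1 suggests a genuine association.

def pair_metrics(baskets, a, b):
    """Return (support, confidence a->b, lift) for items a and b."""
    n = len(baskets)
    n_a = sum(1 for t in baskets if a in t)
    n_b = sum(1 for t in baskets if b in t)
    n_ab = sum(1 for t in baskets if a in t and b in t)
    support = n_ab / n
    confidence = n_ab / n_a if n_a else 0.0
    lift = confidence / (n_b / n) if n_b else 0.0
    return support, confidence, lift
```

High-lift SKU pairs are exactly the interconnections that fed the alias mappings mentioned above.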
- Multivariate Time Series: Deployed SARIMAX models tailored to each sub-product across global sales offices.
- Outcome: ~97% accuracy across 22 sales offices globally for all 3 polyethylene sub-products.
- Performed EDA and pre-processing to build training data from thousands of customer emails.
- Classified claims documents using XGBoost on text from emails, lodgment files, and audio transcripts.
- Built a model-monitoring dashboard in Plotly Dash to track target and feature drift in production.
Founded and scaled an end-to-end algorithmic trading platform serving 9,000+ users across India for quantitative research and systematic trading on Indian and global equity markets.
- Platform: Microservices stack with Python backend on Kubernetes, Redis for caching and order queueing, GitHub Actions for CI/CD, and a React frontend for strategy visualization and live trading.
- Quant Research: Processed multi-year high-frequency equity data to surface tradable signals; tested mean-reversion and momentum hypotheses with ADF, autocorrelation, t-test, and Chi-Square for robustness.
- Leadership: Owned product strategy, architecture, hiring, and client outreach — scaled from concept to 9K+ active users.
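One of the simplest diagnostics behind the mean-reversion vs. momentum testing above is the lag-1 autocorrelation of returns: a clearly negative value hints at mean reversion, a positive one at momentum (the ADF test is the more rigorous follow-up). A generic sketch on synthetic data:

```python
# Sample lag-1 autocorrelation of a return series: the covariance of
# consecutive returns divided by the series variance.

def lag1_autocorr(returns):
    """Return the lag-1 autocorrelation of a numeric sequence."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns)
    cov = sum((returns[t] - mean) * (returns[t - 1] - mean)
              for t in range(1, n))
    return cov / var
```

On a perfectly oscillating series the statistic approaches -1, the mean-reversion extreme; real equity returns sit much closer to zero, which is why robustness checks like ADF and t-tests matter before trading a signal.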
Led society administration and authored tutorial blogs on machine learning fundamentals — including K-NN and Market Basket Analysis — to broaden access to applied data science within the campus.