debu-sinha/README.md

Debu Sinha

Lead Specialist Solutions Architect (Applied AI and ML) @ Databricks. I work on the layer between LLM evaluation infrastructure and the agent frameworks built on top: tracing, scorers, judges, retry semantics, safety hardening.

Reviewer for NeurIPS 2026 (main track, Evaluations & Datasets, Position Papers). IEEE Senior Member.

What I'm working on

SycoBench-600: Measuring Sycophancy and Correction Selectivity in LLM Assistants - ACL 2026 Findings. Introduces correction selectivity as a separate evaluation axis from sycophancy. Code, dataset, and per-model results at sycobench-600.

Completing OTel observability for omnigent (3 merged, 5 open): GenAI semconv span attributes, W3C TRACEPARENT subprocess propagation, GenAI metric instruments, retry events on the production async path, an end-to-end OTLP receiver test, and the canonical Databricks integration guide.

Publications

Practical Machine Learning on Databricks (Packt, 2023)

SycoBench-600: Measuring Sycophancy and Correction Selectivity in LLM Assistants (ACL 2026 Findings)

Learning to Translate with Products of Novices (TACL, 2013)


Writing on mlflow.org


Full OSS contributions · Speaking and workshops

Pinned

  1. Databricks-GenAI-Series Public

    All the resources for the GenAI hands-on workshop.

    Python 23 49

  2. agentsec Public

    Security scanner and hardener for agentic AI installations - OpenClaw, MCP servers, and AI agent skill ecosystems

    Python 7 2

  3. mlflow-modal-deploy Public

    MLflow deployment plugin for Modal serverless GPU infrastructure

    Python 3 1

  4. inspect_ai Public

    Forked from UKGovernmentBEIS/inspect_ai

    Inspect: A framework for large language model evaluations

    Python 1

  5. mlflow Public

    Forked from mlflow/mlflow

    The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

    Python 1

  6. techfutures-2025-mlops-databricks Public

    End-to-end MLOps workshop on Databricks — learn how to train, track, register, and deploy ML models using PyTorch, MLflow, and Unity Catalog, with extensions to LLMOps via Mosaic AI.

    Python 2 1