sarthak biswas

open for work

ai engineer

i build what comes after pretraining — fine-tuning pipelines, rl environments, reward systems, and production ml that ships.

writes

Breaking the NN Ceiling — LightGBM, FT-Transformer, and Why Ensembles Work

2026-05-09

The neural net was stuck at 264s. Diagnostics said rare pairs were the problem. The fix wasn't a better NN — it was two completely different models and an ensemble that cut MAE to 253s.

From 351s to 253s — Feature Engineering, Neural Nets, and Why Statistics Beat XGBoost

2026-05-06

Zone-pair median — a dictionary lookup — beats XGBoost with zero ML. Here's how I built 26 features, iterated through 4 neural net versions, and hit a ceiling that no architecture change could break.

From 0.300 to 0.537 — 15 Experiments Training an LLM Stock Trader with SFT and GRPO

2026-05-04

15 models. 4 disasters. A complete experiment log of training a 7B LLM to trade stocks using supervised fine-tuning and Group Relative Policy Optimization — what worked, what broke, and why.

Building an RL Environment That Actually Works — Rewards, World Models, and Why Environments Are Harder Than Training

2026-04-26

Every shortcut I left in the environment, the agent found and exploited. Here's how I designed observations, rewards, grading, and a neural world model for training LLM trading agents on real Indian equity data.

experience

doubtflix (aiprep)

ai engineer

nov 2025 - apr 2026 (full time, remote)

ai video generation pipeline (0 to 1), llm orchestration, model fine-tuning, rag search, and recommendation systems for an edtech platform.

video generation pipeline

ai video generation pipeline (0 to 1) using manim + elevenlabs tts, multilingual narration across 14+ languages.

llm orchestration

multi-stage llm pipeline with model routing, tool calling, and async task execution.

dataset engineering & finetuning

sft/dpo data pipelines using unsloth ai, curated datasets across 13 subjects for domain-specific fine-tuning.

rag & search

rag-powered search and production apis, contextual retrieval across educational content.

recommendation system

personalized recommendation engine using collaborative filtering, vector embeddings, and weighted ranking with cold-start handling.

fastapi, postgresql, docker, s3, hugging face, unsloth ai

exploring

browser world model

architecture design

causal transformer world model for browser environments — predicting dom state transitions from agent actions, enabling offline rl training for web agents.

specification gaming detection

research & experiment design

formalizing a 3-class taxonomy of llm agent specification gaming (reward-free inaction, proxy gaming, kl catastrophe) and building a world-model-grounded detector that flags gaming during grpo training.

projects

/ DATE | / PROJECT | / TYPE

april, 2026 | llm stock trader, fine-tuning & rl alignment | ml/fine-tuning

april, 2026 | stock trader rl environment | rl/ml

may, 2026 | eta prediction engine | ml/deep learning

march, 2026 | autonomous trader agent | ml/quant

may, 2026 | rag system | ml/nlp

skills

/ CATEGORY

/ TOOLS

ai / ml

PyTorch, HF Transformers, Scikit-learn, Unsloth AI, vLLM

techniques

LLM Fine-tuning, Reinforcement Learning, Deep Learning, RAG, Feature Engineering, Data Pipelines

data / libraries

Pandas, NumPy, OpenCV, NLTK, Matplotlib

backend / databases

FastAPI, Celery, Redis, PostgreSQL, Vector Databases

cloud / devops / mlops

AWS/GCP, DigitalOcean, Docker, CI/CD, GitHub Actions

languages & tools

Python, TypeScript, Git, Linux, Vim
