$ whoami --verbose

ML Engineer · MLOps · Köln, Germany

ML in production.
Trained, deployed, observed — shipped.

Machine Learning Engineer focused on MLOps and DevOps for AI — building scalable pipelines, cloud infrastructure, and production-grade LLM / multimodal systems. Research on the side.



Years in production ML

225+ Scholar citations

AI products shipped

30+ Freelance deliveries


Md Shohanur Islam Sobuj


Open to talks

ML Engineer · MLOps · LLMs · Köln

Shohanur Islam Sobuj · ML.Engineer ↗

Now: building multimodal AI at Anymate Me ◆ research · LLM-Mixer ◆ MLOps · multimodal ◆ Responds in 24h · CET

01 Selected work

Production systems, published research — and the blur between them.

01 Production
Anymate Me GmbH · 2024–Present

Agentic Slide-to-Video Engine

Prompt / PDF / PPTX → AI agent → lip-sync avatar video

Full multimodal pipeline: the user provides a prompt, uploads a PDF, or drops in a PPTX. A RAG-backed AI agent rewrites and structures the slide content, then drives a speech-synthesis and avatar lip-sync system to render the final video. Also engineered a standalone PPTX content agent: given a natural-language instruction, it edits text, layout, or data on any slide.

Prompt / PDF / PPTX input · RAG-backed slide rewriting · Avatar lip-sync at scale

MLOps · Multimodal · RAG · Agent · GCP · PyTorch

02 Production
Anchorblock Technology · 2022–23

Finix

AI-powered banking assistant

Conversational AI banking platform with voice interface, transaction intelligence, and personalised financial insights. Shipped to Google Play with Docker + GitHub Actions CI/CD.

Google Play shipped · Voice + text interface

Conversational AI · FastAPI · Docker · CI/CD

03 Production
Business Automation Ltd · 2023–24

CDC Data Pipeline

Real-time change-data-capture for document intelligence

Production Kafka/Debezium CDC pipeline with downstream OCR validation and NLP analyzers processing business documents at scale.

Real-time CDC · OCR + NLP downstream

Kafka · Debezium · OCR · NLP · Python
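A minimal sketch of how a consumer might read one change event before handing it to OCR or NLP stages. It relies only on the `before` / `after` / `op` / `ts_ms` envelope that Debezium documents; the column names, the sample row, and the `needs_ocr_validation` routing rule are assumptions made for illustration, not details of the production system.

```python
import json

# Debezium's documented op codes. Everything else in this sketch
# (column names, routing rule, sample row) is hypothetical.
OPS = {"c": "create", "u": "update", "d": "delete", "r": "snapshot"}


def parse_change_event(raw: bytes) -> dict:
    """Flatten one Debezium JSON message into a small routing record."""
    message = json.loads(raw)
    # With the JSON converter and schemas enabled, the envelope sits under "payload".
    envelope = message.get("payload", message)
    op = OPS.get(envelope["op"], "unknown")
    # Deletes carry the old row in "before"; other ops carry the new row in "after".
    row = envelope["before"] if op == "delete" else envelope["after"]
    return {"op": op, "row": row, "ts_ms": envelope.get("ts_ms")}


def needs_ocr_validation(event: dict) -> bool:
    """Assumed rule: only new or changed rows go to the downstream OCR/NLP stage."""
    return event["op"] in {"create", "update", "snapshot"} and event["row"] is not None


sample = json.dumps({
    "payload": {
        "before": None,
        "after": {"id": 42, "doc_type": "invoice"},
        "op": "c",
        "ts_ms": 1700000000000,
    }
}).encode()

event = parse_change_event(sample)
print(event["op"], needs_ocr_validation(event))
```

Tombstone messages (a null value after a delete) are not handled here; a real consumer would skip them before parsing.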

04 Research
↗ NeurIPS Workshop 2025

LLM-Mixer

Multiscale mixing in LLMs for time-series forecasting

Treats time-series patches as language tokens; multiscale mixing on a frozen LLM backbone achieves SOTA on 4 of 7 benchmarks with 30% fewer FLOPs.

SOTA on 4/7 datasets · 30% fewer FLOPs

Time-series · LLM · PEFT · PyTorch
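To make "patches as tokens" and "multiscale" concrete, here is a small pure-Python sketch. It is not the paper's implementation: average pooling as the downsampling step, the factors `(1, 2, 4)`, and a patch length of 4 are all assumptions chosen to keep the example short.

```python
def downsample(series: list[float], factor: int) -> list[float]:
    """Average-pool the series over non-overlapping windows of `factor` points."""
    usable = len(series) - len(series) % factor  # drop the ragged tail
    return [sum(series[i:i + factor]) / factor for i in range(0, usable, factor)]


def to_patches(series: list[float], patch_len: int) -> list[list[float]]:
    """Split a series into non-overlapping patches; each patch plays the role of one token."""
    usable = len(series) - len(series) % patch_len
    return [series[i:i + patch_len] for i in range(0, usable, patch_len)]


def multiscale_patches(series, factors=(1, 2, 4), patch_len=4):
    """Return {factor: patches}: a fine view plus progressively coarser views."""
    return {f: to_patches(downsample(series, f), patch_len) for f in factors}


series = [float(t) for t in range(16)]
views = multiscale_patches(series)
for factor, patches in views.items():
    print(factor, len(patches), patches[0])
```

In a full model each patch would be projected into the embedding space of the language model and the views mixed across scales; that part is omitted here.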

05 Research
↗ QPAIN 2025 · IEEE

OCR Tamper-Proof Signing

Document integrity verification

OCR-enhanced digital signature system that detects in-place document tampering with 99.4% accuracy at sub-second verification speed. Published at IEEE QPAIN 2025.

99.4% tamper detection · Sub-second verify

OCR · Security · IEEE

06 Research
ICLR 2024 · Tiny Papers

L-TUNING

Synchronized label tuning for LLMs

Novel prompt/prefix-tuning method with synchronized label optimization — +3.7% on GLUE with 92% fewer trainable parameters than full fine-tuning.

+3.7% acc on GLUE · 92% fewer params

PEFT · LLMs · NLP · PyTorch

02 Experience

Six years.
Four chapters.

Anymate Me GmbH

Machine Learning Engineer

Dec 2024 – Present · Köln, Germany · Full-Time

—Developing and deploying comprehensive MLOps pipelines for PPTX→video generation at scale.
—Building multimodal stack: speech synthesis, avatar lip-sync, slide intelligence, RAG.
—Owning end-to-end model lifecycle from training through canary deploys to production.

Python · PyTorch · MLOps · GCP · Docker

Business Automation Ltd.

Machine Learning Engineer

Nov 2023 – Oct 2024 · Dhaka, Bangladesh · Full-Time

—Architected CDC pipelines (MySQL · Debezium · Kafka · Zookeeper) feeding live ML services.
—Shipped SmartRemarks — an intelligent text-analysis system for internal operations.
—Delivered a high-accuracy OCR service for TIN-certificate validation (>99% precision).
—Built and maintained scalable ML microservices infrastructure.

Kafka · Debezium · MySQL · OCR · NLP · Docker

Anchorblock Technology LLC

Machine Learning Engineer

May 2022 – Oct 2023 · Dhaka, Bangladesh · Full-Time

—Led design, development and delivery of 13 AI/ML products across conversational AI, NLP, CV and FinTech.
—Architected LLM-powered enterprise knowledge bots using advanced RAG techniques.
—Configured and managed scalable AWS infra (EC2, ECS, S3) and CI/CD via GitHub Actions.

LLMs · RAG · Amazon Lex · AWS · FastAPI · CI/CD · Docker

Self-Employed

Freelance ML Engineer

Jan 2019 – Apr 2022 · Remote · Freelance

—Partnered with international clients to design and deploy custom ML solutions.
—Delivered 30+ projects across computer vision, NLP and predictive analytics.
—Built production systems for image classification, real-time object detection and ASR.

PyTorch · TensorFlow · OpenCV · FastAPI · AWS · Docker

03 Research & publications

Selected papers,
research on the side.

Google Scholar ↗

2025 · NeurIPS Workshop

LLM-Mixer: Multiscale Mixing in LLMs for Time Series Forecasting

15 citations

PDF ↗ · Code ↗

2025 · arXiv

PEFT A2Z: Parameter-Efficient Fine-Tuning Survey for Large Language and Vision Models

14 citations

PDF ↗

2025 · QPAIN

OCR-Enhanced Digital Signatures for Tamper-Proof Document Integrity Verification

PDF ↗

2024 · IEEE Access

Securing Electric Vehicle Performance: Machine Learning-Driven Fault Detection and Classification

60 citations

PDF ↗

2024 · Scientific Reports

Parameter-Efficient Fine-Tuning of Large Language Models Using Semantic Knowledge Tuning

36 citations

PDF ↗

2024 · iCACCESS

Leveraging Pre-trained CNNs for Efficient Feature Extraction in Rice Leaf Disease Classification

11 citations

PDF ↗

2024 · arXiv / ICLR

L-TUNING: Synchronized Label Tuning for Prompt and Prefix Tuning in LLMs

8 citations

PDF ↗ · Code ↗

2023 · EMNLP Workshop

Contrastive Learning for Universal Zero-Shot NLI with Cross-Lingual Sentence Embeddings

3 citations

PDF ↗

2022 · Applied Sciences

An Enhanced Neural Word Embedding Model for Transfer Learning

43 citations

PDF ↗

2021 · ICECIT

A Classical Approach to Handcrafted Feature Extraction for Bangla Handwritten Digit Recognition

23 citations

PDF ↗

2021 · ICIRCA

BanglaLM: Data Mining Based Bangla Corpus for Language Model Research

9 citations

PDF ↗

2021 · Res. Comput. Lang.

An Efficient Approach on Sentiment Analysis of Bangla Social Media Data Using FastText

2 citations

PDF ↗

04 Stack

The graph I think in.

Research up top, engineering in the middle, cloud underneath. The fun is in the edges.

serving

Docker · Kubernetes · FastAPI · Kafka · MLflow · Debezium

cloud

AWS (EC2/ECS/S3) · GCP · GitHub Actions · Terraform

core

Python · PyTorch · TensorFlow · Transformers

domains

LLMs / RAG · NLP · Computer Vision · OCR · Time-series

skills.graph · 19 nodes · 26 edges

05 ai studio · 2025–26

What I'm
shipping now.

Three active workstreams — agentic systems, generative UGC pipelines, and workflow automation. All in production, all instrumented with evals and cost telemetry.

agentic

Chatbot Agents

Tool-using LLMs in prod.

Multi-step agents with RAG, function-calling, structured outputs and refusal logic. LangGraph + custom eval gates before every release.

< 400 ms p95 first token

94% tool-routing accuracy

LangGraph · Claude · OpenAI · pgvector · Redis

ugc

AI UGC Pipeline

Script → avatar → video.

PPTX / PDF / prompt to narrated video: script gen, multilingual TTS, lip-sync avatars, branded templates. Live at Anymate Me.

60+ languages

5 min to first video

Diffusion · TTS · FFmpeg · Whisper · S3

automation

Workflow Automation

Agents that do the work.

Doc classification, OCR validation, routing — built as idempotent event workers with DLQs, retry logic and full audit trails.

99.2% OCR accuracy

-68% manual reviews

Kafka · Debezium · Temporal · FastAPI · Postgres
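The combination of idempotency, retries, a dead-letter queue, and an audit trail can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the `Worker` class, the `id` field used as idempotency key, and the `flaky_classifier` handler are hypothetical, and a real worker would keep its state in durable storage and back off between attempts.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Worker:
    """In-memory stand-in for an event worker (hypothetical, for illustration)."""
    handler: Callable[[dict], None]
    max_attempts: int = 3
    processed: set = field(default_factory=set)        # idempotency keys already handled
    dead_letters: list = field(default_factory=list)   # events that exhausted their retries
    audit_log: list = field(default_factory=list)      # (key, outcome) for every decision

    def handle(self, event: dict) -> str:
        key = event["id"]
        if key in self.processed:
            # Redelivered event: do nothing, but record that it was seen.
            self.audit_log.append((key, "skipped_duplicate"))
            return "skipped_duplicate"
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.handler(event)
            except Exception as exc:
                self.audit_log.append((key, f"attempt_{attempt}_failed: {exc}"))
                continue
            self.processed.add(key)
            self.audit_log.append((key, "processed"))
            return "processed"
        self.dead_letters.append(event)
        self.audit_log.append((key, "dead_lettered"))
        return "dead_lettered"


calls = {"n": 0}


def flaky_classifier(event: dict) -> None:
    """Hypothetical handler: rejects 'bad' documents, fails once on its first good call."""
    if event["doc"] == "bad":
        raise ValueError("unreadable document")
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("transient failure")


worker = Worker(handler=flaky_classifier)
results = [
    worker.handle({"id": "a1", "doc": "invoice"}),  # retried once, then processed
    worker.handle({"id": "a1", "doc": "invoice"}),  # duplicate delivery
    worker.handle({"id": "b2", "doc": "bad"}),      # fails every attempt
]
print(results)
```

Recording the idempotency key only after the handler succeeds means a crash mid-handling leads to a retry rather than a silently dropped event.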

06 case study

Production · Multimodal AI

From prompt to
lip-sync video.

A user uploads a PDF, PPTX, or types a prompt. A RAG-backed content agent parses the input, rewrites and structures slide content, then hands off to a TTS engine for voiceover. An avatar renderer synchronises lip movement and gesture, and the final video is encoded and delivered — all on a managed GCP pipeline with quality gates at every stage.

< 3 min p95 end-to-end (prompt → video)

96.2% slide groundedness (RAG eval)

99.4% pipeline uptime (GCP)

< 4% content rewrite error rate

PyTorch · RAG · TTS · Avatar Lip-sync · FastAPI · GCP · MLflow · Docker

fig. 1: slide-to-video pipeline (live)
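One way to picture "quality gates at every stage" is a list of stages, each paired with a check on its own output. The sketch below is only a toy under stated assumptions: the `Stage` type, the three lambda stages, and their gates are hypothetical stand-ins for the real content agent, TTS engine, and avatar renderer.

```python
from typing import Callable, NamedTuple


class Stage(NamedTuple):
    name: str
    run: Callable[[dict], dict]    # transforms the job state
    gate: Callable[[dict], bool]   # quality check on the stage output


def run_pipeline(job: dict, stages: list[Stage]) -> dict:
    """Run stages in order; stop at the first stage whose gate rejects its output."""
    state = dict(job)
    for stage in stages:
        state = stage.run(state)
        if not stage.gate(state):
            return {**state, "status": "failed", "failed_stage": stage.name}
    return {**state, "status": "done"}


# Hypothetical stand-ins: each lambda fakes the work of a real component.
STAGES = [
    Stage("rewrite",
          lambda s: {**s, "script": s["prompt"].strip().capitalize()},
          lambda s: len(s["script"]) > 0),
    Stage("tts",
          lambda s: {**s, "audio_seconds": 0.4 * len(s["script"].split())},
          lambda s: s["audio_seconds"] > 0),
    Stage("render",
          lambda s: {**s, "video": f"{s['job_id']}.mp4"},
          lambda s: s["video"].endswith(".mp4")),
]

ok = run_pipeline({"job_id": "demo", "prompt": "explain change data capture"}, STAGES)
empty = run_pipeline({"job_id": "demo2", "prompt": "   "}, STAGES)
print(ok["status"], empty["status"], empty.get("failed_stage"))
```

Failing fast at the first rejected gate keeps expensive later stages, such as rendering, from running on content that already failed a cheaper check.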

07 how I work

How I ship
AI in 2026.

01 Evals first

Golden set before a line of code runs.

LLM-as-judge + human spot-checks. Catch regressions before prod, not after a customer complaint.
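A golden-set gate can be as small as the sketch below. The examples, the `exact_match` scorer, the stub `candidate_model`, and the baseline and tolerance values are all assumptions for illustration; in practice the scorer slot is where an LLM-as-judge call or a human-reviewed rubric would plug in.

```python
def exact_match(expected: str, actual: str) -> float:
    """Simplest possible scorer; a judge model could replace it behind the same signature."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def run_gate(golden_set, model, scorer=exact_match, baseline=0.0, tolerance=0.02):
    """Score `model` on every golden example and compare against the last release."""
    scores = [scorer(ex["expected"], model(ex["input"])) for ex in golden_set]
    mean = sum(scores) / len(scores)
    failures = [ex["input"] for ex, s in zip(golden_set, scores) if s < 1.0]
    return {"score": mean, "passed": mean >= baseline - tolerance, "failures": failures}


# Hypothetical golden set and model stub.
GOLDEN_SET = [
    {"input": "capital of Germany?", "expected": "Berlin"},
    {"input": "2 + 2", "expected": "4"},
    {"input": "3 * 3", "expected": "9"},
    {"input": "opposite of hot?", "expected": "cold"},
]


def candidate_model(prompt: str) -> str:
    """Stands in for a real model call; gets one answer wrong on purpose."""
    answers = {
        "capital of Germany?": "berlin",
        "2 + 2": "4",
        "3 * 3": "9",
        "opposite of hot?": "warm",
    }
    return answers.get(prompt, "")


report = run_gate(GOLDEN_SET, candidate_model, baseline=0.80)
print(report)
```

Returning the failing inputs alongside the score makes the human spot-check cheap: reviewers look at the regressions first rather than sampling blindly.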

02 Build

Prompt → RAG → PEFT — in that order.

Pull the cheapest lever first. LoRA or QLoRA when the task demands it; agents and tool-calling only where they carry real weight.

03 Ship

Reproducible, observable, rollback-able.

Docker, MLflow, CI/CD on GCP. Token-level tracing, latency SLOs, canary before blue/green. Cost is a first-class metric.
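A latency SLO only helps if something enforces it. The sketch below shows one possible canary gate: compute a nearest-rank p95 for baseline and canary traffic, then promote, hold, or roll back. The 400 ms SLO, the 10% regression budget, and the synthetic latency samples are illustrative assumptions, not the thresholds of any real deployment.

```python
def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile, using integer arithmetic for the rank."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def canary_decision(baseline_ms, canary_ms, slo_ms=400.0, max_regression=1.10):
    """Promote only if the canary meets the SLO and stays within the regression budget."""
    base, cand = p95(baseline_ms), p95(canary_ms)
    if cand > slo_ms:
        return "rollback"   # hard SLO breach
    if cand > base * max_regression:
        return "hold"       # within SLO, but noticeably slower than baseline
    return "promote"


# Synthetic latencies in milliseconds (hypothetical data).
baseline = [100 + i for i in range(100)]   # 100 .. 199
healthy = [110 + i for i in range(100)]    # 110 .. 209
slower = [130 + i for i in range(100)]     # 130 .. 229
broken = [500] * 100

for name, sample in [("healthy", healthy), ("slower", slower), ("broken", broken)]:
    print(name, p95(sample), canary_decision(baseline, sample))
```

Separating "hold" from "rollback" leaves room for a human or a longer soak period when the canary is slower but still inside the SLO.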

04 Tune

Smaller, faster, grounded.

PEFT over full fine-tuning. Quantization, prompt caching, guardrails. Red-team evals and continuous A/B — never ship and forget.

08 Playground

Rather ask than read?
There's a small model for that.

Ask the model anything · grounded on cv.json


A small chat agent grounded on Shohanur's CV. Try: "What did you build at Anchorblock?"

10 Contact

Let's talk.
Research, production, or both.

Always happy to talk ML, research, or interesting engineering problems. Based in Köln — working across DE & EU.

email: [email protected] ↗
github: github.com/shohanursobuj ↗
linkedin: in/shohanursobuj ↗
scholar: Google Scholar ↗

Open to conversations

