Tae Hyun Kim (Lowell)

Personalization

HTE · targeting · recommendation · pricing

The through-line — individualized clinical treatment decisions and industry targeting, recommendation, and pricing as two sides of one methodological core.

20 notes

Dunnhumby — Track 1: Latent-Factor Customer Segmentation

NMF latent factors (92.44% explained variance) + K-Means yield 7 stable behavioral segments (Bootstrap ARI 0.77) with per-segment marketing actions. Illustrative case study on the public Dunnhumby retail dataset.

2026-06-13 #targeting #segmentation
Dunnhumby — Track 2: Causal Targeting via Heterogeneous Treatment Effects

Meta-learner / Causal Forest CATE estimation under severe positivity violation (propensity-score model AUC 0.989); an OPE-validated policy targets ~31% of customers and surfaces counter-intuitive negative-CATE segments. Hypothesis-generating on public data.

2026-06-13 #causal-inference #targeting #uplift
Applied Causal Inference for Pricing — CATE & SCM Across Public Datasets

An applied case study using only public datasets (LendingClub, iPinYou) that combines CATE estimation for price-sensitivity heterogeneity with SCM-based moderator analysis to design individual-level, risk-based pricing and RTB bidding policies — all findings illustrative and projected, not proprietary.

2026-06-12 #pricing #causal-inference #cate
Customer Segmentation

Customer Segmentation is the unsupervised task of partitioning customers into a finite set of segments by similarity in behavior, value, and preference. A common recipe is latent-factor decomposition followed by clustering: behavioral features → NMF (non-negative, parts-based decomposition) → factor scores → K-Means → segments.
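
The recipe above (behavioral features → NMF → factor scores → K-Means → segments) can be sketched in plain Python. This is a minimal illustration, not the notes' implementation: the toy purchase matrix `V`, the helper names, and the choice of multiplicative updates and farthest-point initialization are all assumptions. In practice a library such as scikit-learn (`sklearn.decomposition.NMF`, `sklearn.cluster.KMeans`) would be the usual choice.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, n_iter=300, seed=0, eps=1e-9):
    """Factor V ~= W @ H with W, H >= 0 using multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[r][j] * num[r][j] / (den[r][j] + eps) for j in range(m)]
             for r in range(rank)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][r] * num[i][r] / (den[i][r] + eps) for r in range(rank)]
             for i in range(n)]
    return W, H

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical customers x product-category spend matrix (rows = customers).
V = [[5, 4, 0, 0], [4, 5, 0, 0], [5, 5, 0, 0],
     [0, 0, 4, 5], [0, 0, 5, 4], [0, 0, 5, 5]]

W, H = nmf(V, rank=2)        # W holds the per-customer factor scores
segments = kmeans(W, k=2)    # cluster customers in factor-score space
print(segments)
```

Clustering on the factor scores `W` rather than on the raw features is the point of the two-stage design: the non-negative factors act as interpretable "parts" of behavior, and K-Means then groups customers by how strongly they load on each part.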

2026-06-12 #targeting #segmentation
Customer Segmentation & Causal Targeting — An Applied Case Study

An end-to-end applied case study on the public Dunnhumby dataset — NMF latent factors and K-Means segmentation feeding meta-learner / Causal Forest HTE and an OPE-validated optimal targeting policy, with a candid look at positivity violation and counter-intuitive "sleeping dog" segments.

2026-06-12 #targeting #segmentation #uplift
From Estimation to Action — How HTE Drives Personalized Policy Across Domains

One methodological spine — estimate heterogeneous treatment effects and turn them into individual-level policies — powers both clinical sequential treatment decisions and industrial targeting, pricing, and recommendation.

2026-06-12 #personalization #causal-inference #decision-making
LLM Multi-Layer Attribute Extraction for Cross-Domain Recommendation

A case study on extracting a 3-layer attribute taxonomy (product / perceptual / theory-grounded) with LLM/VLM pipelines, turning it into user profiles and a mixture-of-experts adaptor, and plugging it into standard recommenders across two public domains (fashion + music).

2026-06-12 #recsys #llm4rec #cross-domain
Optimal Targeting Policy

An Optimal Targeting Policy maps covariates $x$ to a treatment decision $\pi(x)\in\{0,1\}$ so as to maximize policy value.
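
One common way to write policy value (an assumption here, since the note's own formula is not reproduced in this listing) is $V(\pi) = E[Y(\pi(X))]$, the expected outcome if everyone were treated according to $\pi$. A minimal sketch of estimating it from logged data by inverse-propensity weighting follows; the data layout `(x, w, y)`, the function names, and the known propensity of 0.5 are hypothetical.

```python
def ipw_policy_value(data, policy, propensity):
    """Inverse-propensity-weighted estimate of the value of `policy`.

    data: list of (x, w, y) tuples with logged treatment w in {0, 1}.
    propensity(x): P(W = 1 | X = x), assumed known (e.g. a randomized test).
    """
    total = 0.0
    for x, w, y in data:
        e = propensity(x)
        prob_logged = e if w == 1 else 1.0 - e
        # Only rows where the logged action agrees with the policy contribute,
        # reweighted by how likely that action was under the logging policy.
        if policy(x) == w:
            total += y / prob_logged
    return total / len(data)

# Toy logged data: treatment helps when x > 0 and hurts when x < 0.
logged = [(1, 1, 1), (1, 0, 0), (-1, 1, 0), (-1, 0, 1)]
coin_flip = lambda x: 0.5

v_targeted = ipw_policy_value(logged, lambda x: int(x > 0), coin_flip)
v_treat_all = ipw_policy_value(logged, lambda x: 1, coin_flip)
v_treat_none = ipw_policy_value(logged, lambda x: 0, coin_flip)
print(v_targeted, v_treat_all, v_treat_none)
```

On this toy data the targeted rule scores higher than treating everyone or no one, which is the comparison an off-policy evaluation step is meant to make. A doubly robust estimator would add an outcome-model term to the same weighted sum.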

2026-06-12 #targeting #policy-targeting #doubly-robust
RTB Bidding Strategy via Causal ML — From Prediction to Optimization

A five-stage case study on the public iPinYou RTB dataset that moves from pCTR/pCVR prediction through causal effect estimation (CATE, SCM) to budget-constrained optimal bidding and off-policy policy evaluation.

2026-06-12 #decision-making #targeting #ope
Targeting & Profiling Overview

Targeting & Profiling is the industrial face of personalization. The same methodological core (heterogeneous effect estimation → individual-level optimal policy; MOC-Personalization) appears in clinical settings as "optimal treatment assignment per patient," and in industry as "optimal campaign/exposure assignment per customer." This domain answers who…

2026-06-12 #targeting #policy-targeting
Uplift Modeling

Uplift is the causal increment that a treatment (campaign exposure, coupon, recommendation) induces in an individual's outcome (purchase, conversion). The setting is a binary treatment $W\in\{0,1\}$, an outcome $Y$, and covariates $X$.
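
With $W$, $Y$, and $X$ as above, uplift is usually formalized as the conditional average treatment effect $\tau(x) = E[Y \mid X=x, W=1] - E[Y \mid X=x, W=0]$, which has a causal reading only under assumptions such as randomized or unconfounded assignment. The sketch below is a deliberately small T-learner, assuming a single categorical covariate and using per-segment means as the two base models; the data and names are hypothetical.

```python
from collections import defaultdict

def fit_segment_means(rows):
    """'Base learner': the mean outcome per segment, from (segment, y) pairs."""
    sums, counts = defaultdict(float), defaultdict(int)
    for segment, y in rows:
        sums[segment] += y
        counts[segment] += 1
    return {s: sums[s] / counts[s] for s in sums}

def t_learner_uplift(data):
    """T-learner: fit one outcome model per arm, then take the difference.

    data: list of (segment, w, y) with w in {0, 1}.
    """
    mu1 = fit_segment_means((x, y) for x, w, y in data if w == 1)
    mu0 = fit_segment_means((x, y) for x, w, y in data if w == 0)
    return {x: mu1[x] - mu0[x] for x in mu1 if x in mu0}

# Toy experiment: (segment, treated?, converted?)
data = (
    [("new", 1, y) for y in (1, 1, 0, 1)] + [("new", 0, y) for y in (0, 1, 0, 0)]
    + [("loyal", 1, y) for y in (1, 0, 1, 0)] + [("loyal", 0, y) for y in (1, 1, 1, 0)]
)

uplift = t_learner_uplift(data)
print(uplift)
```

Here the "loyal" segment gets a negative estimate: treating those customers lowers conversion in the toy data, so a targeting rule based on the sign of the uplift would leave them untreated.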

2026-06-12 #targeting #uplift #meta-learner
User Profiling

User Profiling is the task of inferring a personal preference profile (taste, context, latent patterns) from a customer's behavioral history and representing it as a vector. It is the shared input layer for targeting, segmentation, and recommendation — the industry-side counterpart to patient covariate/multimodal representations (Multimodal Clinical Data) in the clinical domain.

2026-06-12 #targeting #profiling
Dynamic Treatment Regimes (DTR / OTR)

A DTR is a sequence of decision rules $\{d_t(H_t)\}_{t=1}^T$ mapping the accumulated history $H_t$ (covariates, prior treatments, intermediate outcomes) to a treatment. The optimal treatment regime (OTR) maximizes the expected long-term outcome $E[Y^{d}]$. Estimation:

2026-06-11 #decision-making #clinical-decision-making #dtr
ESCM² (Entire Space Counterfactual Multi-Task Model)

A model that integrates a counterfactual risk regularizer based on the Inverse Propensity Score (IPS) and the Doubly Robust estimator into ESMM, in order to address ESMM's two theoretical limitations — Inherent Estimation Bias (IEB) and Potential Independence Priority (PIP).

2026-03-25 #recsys #causal-inference #doubly-robust
ESMM (Entire Space Multi-Task Model)

A multi-task model that addresses CVR's Sample Selection Bias and Data Sparsity problems simultaneously by exploiting the sequential user behavior $\text{impression} \to \text{click} \to \text{conversion}$ to learn CVR indirectly over the entire impression space.
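
The "indirect" learning of CVR rests on the factorization pCTCVR = pCTR × pCVR: both supervised targets (click, and click-and-conversion) are observed for every impression, so no loss term is restricted to clicked samples. A minimal sketch of that loss, assuming the two towers' predicted probabilities are already given (the function names and toy numbers are hypothetical):

```python
import math

def bce(label, p, eps=1e-12):
    """Binary cross-entropy for one example, clipped for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def esmm_loss(p_ctr, p_cvr, clicks, conversions):
    """Mean CTR loss + CTCVR loss over ALL impressions.

    pCVR is never supervised directly; it is trained only through the
    product pCTCVR = pCTR * pCVR.
    """
    total = 0.0
    for ctr, cvr, click, conv in zip(p_ctr, p_cvr, clicks, conversions):
        p_ctcvr = ctr * cvr
        total += bce(click, ctr) + bce(click * conv, p_ctcvr)
    return total / len(clicks)

# Four impressions: two clicks, one of which converted.
clicks = [1, 1, 0, 0]
conversions = [1, 0, 0, 0]

loss_good = esmm_loss([0.9, 0.8, 0.1, 0.2], [0.9, 0.1, 0.5, 0.5], clicks, conversions)
loss_bad = esmm_loss([0.1, 0.2, 0.9, 0.8], [0.1, 0.9, 0.5, 0.5], clicks, conversions)
print(loss_good, loss_bad)
```

In the full model the two towers share an embedding layer, which is how the data-rich CTR task helps the sparse CVR task; that part is omitted here.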

2026-03-25 #recsys #representation-learning
DeepFM

DeepFM (Guo et al., 2017) is a CTR prediction model that combines an FM component and a Deep component in parallel, jointly learning low-order (explicit) and high-order (implicit) feature interactions.

2026-02-13 #recsys #factor-models #factorization-machine
Factorization Machine

The Factorization Machine (FM) is a general-purpose prediction model proposed by Rendle (2010) that models interactions between all pairs of features as inner products of latent factor vectors.
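
For a degree-2 FM the prediction is $\hat{y}(x) = w_0 + \sum_i w_i x_i + \sum_{i<j} \langle v_i, v_j \rangle x_i x_j$, and the pairwise term can be computed in $O(kn)$ rather than $O(kn^2)$ via $\frac{1}{2}\sum_f \big[(\sum_i v_{i,f} x_i)^2 - \sum_i v_{i,f}^2 x_i^2\big]$. The sketch below checks that the two forms agree; the parameter values are made up for illustration.

```python
import math

def fm_predict_naive(x, w0, w, V):
    """FM score with the explicit sum over all feature pairs."""
    n = len(x)
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))  # <v_i, v_j>
            pairwise += dot * x[i] * x[j]
    return linear + pairwise

def fm_predict_fast(x, w0, w, V):
    """Same score via the linear-time reformulation of the pairwise term."""
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

# Hypothetical parameters: 3 features, latent dimension k = 2.
x = [1.0, 0.0, 2.0]
w0, w = 0.1, [0.2, -0.1, 0.05]
V = [[0.5, 1.0], [1.0, -1.0], [0.2, 0.3]]

y_naive = fm_predict_naive(x, w0, w, V)
y_fast = fm_predict_fast(x, w0, w, V)
print(y_naive, y_fast)
```

Because each pairwise weight is an inner product of latent vectors rather than a free parameter, interactions between features that never co-occur in training can still be estimated, which is what makes FM workable on sparse data.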

2026-02-13 #recsys #factor-models #factorization-machine
PNN

PNN (Qu et al., 2016) is a CTR prediction model that introduces a product layer between the embedding layer and the DNN hidden layers, explicitly capturing the interactions among feature embeddings before passing them to the DNN.

2026-02-13 #recsys #factor-models #product-network
Wide and Deep

Wide & Deep (Cheng et al., 2016) is a CTR prediction model that combines a linear wide component (memorization) with a DNN deep component (generalization). It was first deployed for Google Play app recommendation.

2026-02-13 #recsys #factor-models
Multi-Task Learning

A learning paradigm that jointly trains several related tasks, improving generalization through a shared representation.

2026-01-29 #recsys #multi-task-learning