PNN
Definition
PNN (Product-based Neural Network; Qu et al., 2016) is a click-through rate (CTR) prediction model that inserts a product layer between the embedding layer and the DNN hidden layers, so that interactions among feature embeddings are captured explicitly before they are passed to the DNN.
Product Layer
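The pairwise interactions among the field embedding vectors $f_1, \dots, f_N$ are computed with a product operation $g$:

$$p_{i,j} = g(f_i, f_j)$$

The output of the product layer is the combination of the linear signal $l_z$ and the product signal $l_p$:

$$l_1 = \mathrm{ReLU}(l_z + l_p + b_1)$$

Here $l_z$ is a learned projection of the concatenated embeddings $z = (f_1, \dots, f_N)$, $l_p$ is a learned projection of the pairwise products $p = \{p_{i,j}\}$, and $b_1$ is a bias vector.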
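The product layer described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes the inner product as the product operation, follows the original paper in treating each output unit as a sum of elementwise products between a weight matrix and its input, and uses hypothetical names (`product_layer`, `W_z`, `W_p`, `b_1`) and toy weights chosen only to make the arithmetic easy to follow. A real model would use a tensor library and learned weights.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, values):
    """Sum of elementwise products of two same-shaped matrices."""
    return sum(dot(w_row, x_row) for w_row, x_row in zip(weights, values))


def product_layer(embeddings, W_z, W_p, b_1, g=dot):
    """Sketch of l_1 = ReLU(l_z + l_p + b_1).

    embeddings: N field embeddings, each a list of M floats.
    W_z: one N x M weight matrix per output unit (linear signal).
    W_p: one N x N weight matrix per output unit (product signal).
    b_1: one bias per output unit.
    g:   product operation; must return a scalar here (assumption).
    """
    n = len(embeddings)
    z = embeddings  # linear part: the embeddings themselves
    # Product part: g(f_i, f_j) for every ordered pair of fields.
    p = [[g(embeddings[i], embeddings[j]) for j in range(n)] for i in range(n)]
    l_1 = []
    for w_z, w_p, b in zip(W_z, W_p, b_1):
        l_z = weighted_sum(w_z, z)
        l_p = weighted_sum(w_p, p)
        l_1.append(max(0.0, l_z + l_p + b))  # ReLU
    return l_1


# Toy example: N = 2 fields, M = 2 dimensions, 2 output units.
f = [[1.0, 2.0], [3.0, 4.0]]
W_z = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
W_p = [[[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
b_1 = [-1.0, -2.0]
l_1 = product_layer(f, W_z, W_p, b_1)
print(l_1)  # [15.0, 0.0]
```

In the first unit the linear signal contributes 5, the product signal contributes the single pair value 11, and the bias subtracts 1; the second unit has only a negative bias, so ReLU clips it to zero.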
Variants
Depending on the definition of the product operation $g$, there are three variants:
| Variant | Product operation | Complexity | Characteristics |
|---|---|---|---|
| IPNN | Inner product: $g(f_i, f_j) = \langle f_i, f_j \rangle$ | $O(N^2 M)$ | Scalar output; FM-like interaction |
| OPNN | Outer product: $g(f_i, f_j) = f_i f_j^\top$ | $O(N^2 M^2)$ | Matrix output; richer interaction, higher cost |
| PNN* | Combination of inner + outer product | $O(N^2 M^2)$ | Hybrid; combines the strengths of both approaches |
Here $N$ is the number of fields and $M$ is the embedding dimension. The complexities count the naive pairwise products only, before the projection into the first hidden layer.
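To make the difference between the two product operations concrete, the following sketch computes both for a single pair of embeddings. The function names and the toy vectors are illustrative assumptions; the point is only the shape of the result: one number per pair for the inner product, and a square matrix with one entry per pair of embedding dimensions for the outer product.

```python
def inner_product(f_i, f_j):
    """IPNN-style product: a single scalar per field pair."""
    return sum(a * b for a, b in zip(f_i, f_j))


def outer_product(f_i, f_j):
    """OPNN-style product: a matrix with len(f_i) x len(f_j) entries."""
    return [[a * b for b in f_j] for a in f_i]


# Two toy embeddings of dimension 3 (made-up values).
f_1 = [1.0, 2.0, 3.0]
f_2 = [4.0, 5.0, 6.0]

ip = inner_product(f_1, f_2)  # 1 multiplication result per dimension, summed
op = outer_product(f_1, f_2)  # 3 x 3 = 9 entries for this one pair

print(ip)  # 32.0
print(op)  # [[4.0, 5.0, 6.0], [8.0, 10.0, 12.0], [12.0, 15.0, 18.0]]

# The inner product is the trace (diagonal sum) of the outer product,
# so the outer product carries strictly more information per pair.
trace = sum(op[k][k] for k in range(len(f_1)))
```

The off-diagonal entries are what OPNN adds over IPNN, and they are also where the extra factor of the embedding dimension in its cost comes from.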
Intuitive Understanding
If embeddings are simply concatenated and fed into a DNN, the network has to learn the feature interactions implicitly. PNN precomputes the products of embedding pairs and provides them to the DNN as “hints.”
By analogy, rather than leaving the chef (the DNN) to judge unaided how well two ingredients go together, each ingredient pair is tasted in advance and its compatibility score is handed over alongside the ingredients (the product layer). The chef can then use these scores to make better decisions.
Advantages and Disadvantages
Advantages:
- No pre-training required — end-to-end training is possible (an advantage over FNN)
- The product layer captures feature interactions explicitly
- IPNN combines FM-like interactions with a DNN
Disadvantages:
- Weak on low-order interactions (orders 1 and 2): because the product layer's output passes through the DNN, the original low-order signal can be distorted. Unlike FM, low-order terms do not feed directly into the output
- OPNN computational cost: computing the outer product for every field pair is $O(N^2 M^2)$ for $N$ fields and embedding dimension $M$, which becomes inefficient as the number of fields and the embedding dimension grow
- The pairwise computation in the product layer grows quadratically with the number of fields
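The cost points above can be checked with simple counting. The sketch below is a back-of-the-envelope calculation, assuming naive computation over unordered field pairs and counting only the multiplications inside the product operation (not the later projection into the hidden layer); the function name and the example sizes are hypothetical.

```python
def product_layer_cost(num_fields, embed_dim):
    """Count multiplications for naive pairwise products.

    Assumes every unordered pair (i < j) of fields is processed once.
    """
    pairs = num_fields * (num_fields - 1) // 2
    return {
        "pairs": pairs,
        "inner_mults": pairs * embed_dim,       # M multiplications per pair
        "outer_mults": pairs * embed_dim ** 2,  # M * M multiplications per pair
    }


# Doubling the number of fields roughly quadruples every count.
for n in (10, 20, 40):
    print(n, product_layer_cost(n, embed_dim=16))
```

With 16-dimensional embeddings, 10 fields give 45 pairs, 720 inner-product multiplications, and 11,520 outer-product multiplications; at 40 fields the pair count is already 780.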
Related Concepts
- DeepFM - Parallel combination of FM (low-order) + DNN (high-order); complements the low-order interactions that PNN misses
- Factorization Machine - The inner product of IPNN is similar to FM’s second-order interaction
- Wide and Deep - Parallel Wide (memorization) + Deep (generalization); a different design philosophy from PNN
- FNN - FM pre-training + DNN; PNN replaces pre-training with a product layer instead
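The link between IPNN's inner products and FM's second-order interaction can be illustrated numerically. The sketch below assumes each vector is a field embedding already scaled by its feature value (so one-hot fields contribute the embedding itself), and compares the pair-by-pair sum of inner products with the well-known FM reformulation, half of the squared norm of the sum minus the sum of squared norms. Function names and values are illustrative only.

```python
def pairwise_inner_sum(vectors):
    """Sum of <v_i, v_j> over all pairs i < j, computed pair by pair."""
    n = len(vectors)
    return sum(
        sum(a * b for a, b in zip(vectors[i], vectors[j]))
        for i in range(n)
        for j in range(i + 1, n)
    )


def fm_second_order(vectors):
    """Same quantity via 0.5 * (||sum of v||^2 - sum of ||v||^2)."""
    dim = len(vectors[0])
    total = [sum(v[k] for v in vectors) for k in range(dim)]
    square_of_sum = sum(t * t for t in total)
    sum_of_squares = sum(x * x for v in vectors for x in v)
    return 0.5 * (square_of_sum - sum_of_squares)


embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(pairwise_inner_sum(embeddings))  # 67.0
print(fm_second_order(embeddings))     # 67.0
```

FM sums these pairwise terms directly into its prediction, whereas IPNN keeps them as separate inputs for the DNN to weight.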
Key Papers
- Qu, Y., et al. (2016). Product-based neural networks for user response prediction. ICDM 2016. The original PNN paper
- Guo, H., et al. (2017). DeepFM: A factorization-machine based neural network for CTR prediction. IJCAI 2017. Addresses PNN’s limitation (absence of low-order terms) by combining with FM