PayPal and NVIDIA Deploy Fine-Tuned Nemotron Model to Accelerate Commerce Agent Performance
In December 2025, researchers from PayPal and NVIDIA announced the development of an enhanced version of PayPal’s Commerce Agent, a multi‑agent system intended to streamline agentic commerce on the PayPal platform. Leveraging NVIDIA’s NeMo Framework, the team fine‑tuned a Nemotron small language model (SLM) to improve the system’s search and discovery capabilities, aiming to reduce latency and operational costs while preserving overall quality.
Background and Motivation
PayPal’s Commerce Agent handles a wide range of e‑commerce interactions, with the retrieval component accounting for more than 50% of total response time. The high latency of this component has prompted the need for targeted optimization to meet the performance expectations of modern online shoppers and merchants.
Collaboration with NVIDIA and the NeMo Framework
The partnership between PayPal and NVIDIA enabled the application of the NeMo Framework—traditionally used for speech and language research—to a commerce‑specific context. This marks the first reported use of NeMo for large‑scale e‑commerce agent optimization, according to the authors.
Model Fine‑Tuning Methodology
Researchers employed the llama3.1‑nemotron‑nano‑8B‑v1 architecture as the base, applying Low‑Rank Adaptation (LoRA) techniques to create a suite of fine‑tuned models. Systematic hyperparameter sweeps explored learning rates, optimizer choices (Adam and AdamW), cosine annealing schedules, and LoRA rank values to identify the most effective configuration.
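As a rough illustration of the LoRA technique named above: rather than updating the full weight matrix of a layer, LoRA freezes the pretrained weights and trains two small matrices whose product forms a low-rank update, which is why the rank value is a key hyperparameter in sweeps like the one described. The sketch below is a minimal, self-contained illustration; the variable names, dimensions, and scaling convention are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Minimal LoRA sketch: the frozen base weight W (d_out x d_in) is augmented by
# a trainable low-rank update B @ A, where A is (r x d_in) and B is (d_out x r).
# All shapes and hyperparameter values here are illustrative assumptions.

rng = np.random.default_rng(0)

d_in, d_out, rank, alpha = 64, 64, 8, 16   # rank r << d_in; alpha scales the update

W = rng.standard_normal((d_out, d_in))          # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, rank))                     # trainable up-projection, zero-initialized

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Frozen base output plus the scaled low-rank adaptation term."""
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)

# With B zero-initialized, the adapted layer initially matches the base layer,
# so fine-tuning starts from the pretrained model's behavior.
assert np.allclose(lora_forward(x), W @ x)

# Only r * (d_in + d_out) parameters are trained instead of d_in * d_out,
# which is what makes sweeping many LoRA configurations cheap.
print(rank * (d_in + d_out), "trainable vs", d_in * d_out, "full")
```

Because the trainable parameter count scales with the rank rather than the full layer size, producing a suite of fine-tuned variants for a hyperparameter sweep is far cheaper than full fine-tuning.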
Experimental Results
According to the authors, the fine‑tuned Nemotron SLM reduced the retrieval component’s latency by more than 50%. In addition to the speed gains, the approach lowered computational cost while preserving, and in some cases improving, the quality of agent responses.
Implications for E‑Commerce and Multi‑Agent Systems
These findings suggest that fine‑tuning small language models can be a viable strategy for large‑scale commercial agents, offering a scalable pathway to enhance performance in production environments. The reported framework may serve as a template for other organizations seeking to integrate language models into their service architectures.
Future Directions
The authors indicate plans to extend the optimization process to additional agents within PayPal’s ecosystem and to explore broader applications of NeMo‑based fine‑tuning across diverse commerce scenarios.
This report is based on the abstract of the research paper, an open‑access academic preprint; the full text is available via arXiv.