Product names

The product name should be short, clear, and describe the product itself, not its variants. Do not include color, size, or other variable features in the name. Avoid repeated words, unnecessary descriptors, special characters, and separators. Each product should have one logical name, regardless of the number of variants.
Correct: Cotton T-Shirt
Incorrect: Cotton T-Shirt - Black / M | PROMO!!!
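
The naming rules above can be turned into a rough automated check. The sketch below is only an illustration: the function name `name_issues`, the word list, and the separator pattern are assumptions, not part of any search engine API, and the word list would need to match your own catalog.

```python
import re

# Hypothetical list of variant values; extend it for your own catalog.
VARIANT_WORDS = {"black", "white", "s", "m", "l", "xl"}
# Characters and separators that should not appear in a product name.
SEPARATORS = re.compile(r"[|/!#]| - ")


def name_issues(name: str) -> list[str]:
    """Return a list of rule violations found in a product name."""
    issues = []
    if SEPARATORS.search(name):
        issues.append("separator or special character")
    words = re.findall(r"[a-z0-9]+", name.lower())
    if VARIANT_WORDS.intersection(words):
        issues.append("variant value in name")
    if len(words) != len(set(words)):
        issues.append("repeated word")
    return issues
```

With these assumptions, "Cotton T-Shirt" produces no issues, while "Cotton T-Shirt - Black / M | PROMO!!!" is flagged for both a separator and a variant value. Single-letter sizes can cause false positives, so treat the output as a hint for review rather than a hard rule.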

Product descriptions

The description should be unique, detailed, and written in natural language. Use several full sentences that explain the product’s use, features, and context. Do not copy descriptions between products or rely only on keyword lists. The description is the best place for synonyms and language your customers use.
Correct: Soft cotton T-shirt with a classic fit. Great for everyday wear and easy to layer under a jacket. Breathable fabric with reinforced seams for better durability.
Incorrect: T-shirt, cotton, men, black, M, L, XL, best quality, cheap

Product attributes

Attributes should come only from dedicated attribute fields, not from names or descriptions. Use consistent attribute names and value formats across your catalog. Do not include marketing text or long descriptions in attributes—use only specific, clear values.
Correct:
  • color: black
  • material: 100% cotton
  • fit: regular
Incorrect:
  • color: black, perfect gift, bestseller, shipping 24h
  • material: cotton!!! top quality!!!
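
One way to keep attribute values specific is to validate them against a list of allowed values before indexing. This is a minimal sketch; the `ALLOWED` table and the function `invalid_attributes` are hypothetical and would be replaced by your own attribute dictionary.

```python
# Hypothetical allow-list: attribute name -> permitted values.
ALLOWED = {
    "color": {"black", "white"},
    "material": {"100% cotton"},
    "fit": {"regular", "slim"},
}


def invalid_attributes(attributes: dict[str, str]) -> dict[str, str]:
    """Return the attributes whose name or value is not on the allow-list."""
    return {
        name: value
        for name, value in attributes.items()
        if name not in ALLOWED or value not in ALLOWED[name]
    }
```

Values such as "cotton!!! top quality!!!" are returned as invalid because they do not match any permitted value exactly.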

Product variants

Differences such as color, size, or other options should be handled as variants, not as separate products with different names. The product name should be the same for all variants. If variants are indexed separately, shared data (description, category, product type) must be identical.
Correct:
  • Product: Cotton T-Shirt
  • Variants: color: black, size: S / M / L / XL (same name; shared fields like description/category are identical)
Incorrect (separate products): Cotton T-Shirt Black M, Cotton T-Shirt White L
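
If your source data has one flat row per color and size, the rows can be grouped into a single product with a list of variants. The sketch below assumes each row is a dictionary with `name`, `description`, `category`, `color`, and `size` keys; these field names and the function `group_variants` are illustrative only.

```python
# Fields that must be identical for every variant of the same product.
SHARED_FIELDS = ("description", "category")


def group_variants(rows: list[dict]) -> list[dict]:
    """Collapse flat rows (one per color/size) into one product per name."""
    products: dict[str, dict] = {}
    for row in rows:
        product = products.get(row["name"])
        if product is None:
            product = {"name": row["name"], "variants": []}
            product.update({field: row[field] for field in SHARED_FIELDS})
            products[row["name"]] = product
        # Reject rows whose shared data disagrees with the first row seen.
        for field in SHARED_FIELDS:
            if product[field] != row[field]:
                raise ValueError(f"{field!r} differs between variants of {row['name']!r}")
        product["variants"].append({"color": row["color"], "size": row["size"]})
    return list(products.values())
```

Raising an error on mismatched shared fields is a deliberate choice here: it surfaces inconsistent descriptions or categories before they reach the index.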

Categories

Assign each product a logical and consistent category. Categories should be hierarchical and used consistently throughout the store. Do not use categories as tags or to describe product features. Semantic duplicates and different names for the same category reduce search quality.
Correct: Apparel > T-Shirts > Men
Incorrect: Awesome T-Shirts, Black, Gym wear, Bestsellers (tags/features, not categories)
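
A hierarchical category path can be split into its levels, which is handy when checking that every product uses the same tree. This is a small sketch; the function `category_levels` and the " > " separator convention are assumptions based on the example above.

```python
def category_levels(path: str, sep: str = ">") -> list[str]:
    """Return every ancestor level of a category path, from root to leaf."""
    parts = [part.strip() for part in path.split(sep) if part.strip()]
    return [" > ".join(parts[: i + 1]) for i in range(len(parts))]
```

For "Apparel > T-Shirts > Men" this returns "Apparel", "Apparel > T-Shirts", and "Apparel > T-Shirts > Men", so inconsistent spacing around the separator no longer produces different category names.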

Language consistency

Product data should use a single language within one index. Do not mix languages in names, descriptions, or attributes. Place synonyms and alternative names in descriptions or dedicated fields, not in the product name.
Do not mix languages and styles of expression.

Data normalization

Before indexing, data should be cleaned and standardized. This includes letter case, extra spaces, typos, and value formatting. The same information must appear identically for all products so the search engine can interpret it correctly.
Correct:
  • color: black (always the same spelling/case)
  • size: 42 (consistent format)
Incorrect:
  • color: Black, black , BLACK, blk
  • size: 42, 42 EU, EU42
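
The clean-up described above can be sketched as two small normalization functions. The alias table and function names are hypothetical, and `normalize_size` only handles numeric sizes like the ones in the example; letter sizes would need their own rule.

```python
import re

# Hypothetical alias table mapping abbreviations to the canonical value.
COLOR_ALIASES = {"blk": "black"}


def normalize_color(value: str) -> str:
    """Trim and collapse whitespace, lowercase, and resolve known aliases."""
    cleaned = " ".join(value.split()).lower()
    return COLOR_ALIASES.get(cleaned, cleaned)


def normalize_size(value: str) -> str:
    """Keep only the numeric part of a size such as '42 EU' or 'EU42'."""
    match = re.search(r"\d+", value)
    if match is None:
        raise ValueError(f"no numeric size found in {value!r}")
    return match.group()
```

Under these assumptions, "Black", "black ", "BLACK", and "blk" all become "black", and "42", "42 EU", and "EU42" all become "42".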

Uniqueness and completeness

Each product should have a unique identifier and a complete set of key data, such as name, description, and category. Empty fields and placeholder values, including technical ones such as null or test IDs, lower search quality.
Correct:
  • id: 834719
  • name: City Backpack
  • description: …
  • category: Accessories > Backpacks
  • price_usd: 79.99
Incorrect:
  • id: null or id: test123
  • description: TBD / - / empty
  • missing category
  • price: $0 as a placeholder
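
A completeness check along these lines can run before indexing. The required field list, the placeholder set, and the function `missing_fields` are assumptions for illustration; the sketch does not detect test IDs such as "test123", which would need a catalog-specific rule.

```python
# Hypothetical configuration: which fields are required and what counts as empty.
REQUIRED = ("id", "name", "description", "category")
PLACEHOLDERS = {"", "-", "tbd", "n/a"}


def missing_fields(product: dict) -> list[str]:
    """Return the names of fields that are absent, empty, or placeholders."""
    problems = []
    for field in REQUIRED:
        value = product.get(field)
        if value is None or str(value).strip().lower() in PLACEHOLDERS:
            problems.append(field)
    # A zero or non-numeric price is treated as a placeholder as well.
    price = product.get("price_usd")
    if not isinstance(price, (int, float)) or price <= 0:
        problems.append("price_usd")
    return problems
```

For example, `missing_fields({"id": None, "name": "City Backpack", "description": "TBD", "price_usd": 0})` returns `["id", "description", "category", "price_usd"]`.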

Numeric and structured data

Prices, weights, dimensions, and other numeric values should be stored in numeric fields. If a value must carry additional text, such as a unit, use exactly the same format for all products. Dates should use a single, consistent format.
Correct:
  • price_usd: 199.99
  • weight_kg: 0.45
  • length_cm: 30, width_cm: 20, height_cm: 10
  • release_date: 2026-02-14
Incorrect:
  • price: “$199.99 incl. tax” (text instead of numeric field)
  • weight: “0.45 kg / lightweight”
  • dimensions: “30x20x10cm approx.” (inconsistent/free text)
  • dates mixed as 14/02/26, 2026.02.14, Feb 14th 2026
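
Free-text values like the incorrect ones above can often be converted during import. The sketch below is one possible approach: the function names and the list of accepted date formats are assumptions, slash-separated dates are read as day/month/year, and `parse_number` extracts only the first number, so combined dimensions such as "30x20x10cm" need separate handling.

```python
import re
from datetime import datetime

# Assumed input formats; slash dates are interpreted as day/month/year.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%y", "%Y.%m.%d")


def parse_number(text: str) -> float:
    """Extract the first decimal number from a free-text value."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


def parse_date(text: str) -> str:
    """Convert a date in one of the known formats to YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")
```

With these assumptions, "$199.99 incl. tax" becomes 199.99, and both "14/02/26" and "2026.02.14" become "2026-02-14". A value such as "Feb 14th 2026" is rejected, which makes unexpected formats visible instead of guessing.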

Data approach

The search engine uses only the provided data and does not interpret context. The more organized, predictable, and semantically clean the product data, the better the search, ranking, and filtering results.
The search engine uses exactly the data you provide. Keep names stable, store options as variants, and keep attributes consistent so search, ranking, and filters work reliably.
Stuffing everything into the name/description (“BLACK M SALE HIT!!!”) and mixing marketing text into attributes makes the data unpredictable and hurts search quality.
  • Product names should be short, clear, and not include variants (color, size, etc.).
  • Descriptions must be unique, longer, and written in natural language—not just keyword lists.
  • Keep attributes only in attribute fields, with one consistent name and value format.
  • Handle variants as variants, not as separate products with different names.
  • Categories should be logical, consistent, and used throughout the store.
  • Use a single data language—do not mix languages in names, descriptions, or attributes.
  • Clean data: no typos, repetitions, unnecessary characters, or placeholders.
  • Each product must have all key information and a unique ID.
  • Numeric values (prices, dimensions, weights) should be numbers; they may include words, but this must be standardized for all products.
  • The simpler and more organized the data, the better the search and filtering results.
TL;DR Correct data example
{
  "id": "SKU-834719",
  "name": "Cotton T-Shirt",
  "description": "Soft cotton T-shirt with a classic fit. Great for everyday wear and easy to layer under a jacket. Breathable fabric with reinforced seams for better durability.",
  "category": "Apparel > T-Shirts > Men",
  "attributes": {
    "material": "100% cotton",
    "fit": "regular",
    "brand": "Northwind",
    "color": "black",
    "size": "M"
  },
  "variants": [
    { "variant_id": "SKU-834719-BLK-S", "color": "black", "size": "S", "price_usd": 24.99, "weight_kg": 0.20 },
    { "variant_id": "SKU-834719-BLK-M", "color": "black", "size": "M", "price_usd": 24.99, "weight_kg": 0.20 },
    { "variant_id": "SKU-834719-BLK-L", "color": "black", "size": "L", "price_usd": 24.99, "weight_kg": 0.21 },
    { "variant_id": "SKU-834719-WHT-M", "color": "white", "size": "M", "price_usd": 24.99, "weight_kg": 0.20 }
  ],
  "structured": {
    "release_date": "2026-02-14",
    "dimensions_cm": { "length": 28, "width": 22, "height": 2 }
  }
}
TL;DR Incorrect data example
{
  "id": "test123",
  "name": "Cotton T-Shirt - BLACK / M | SALE!!! #1 BESTSELLER",
  "description": "Cotton, t-shirt, men, BLACK, M/L/XL, cheap, best quality, breathable upper, szybka wysyłka 24h",
  "category": "Black / Bestsellers / Gym",
  "attributes": {
    "color": "Black, perfect gift, bestseller, shipping 24h!!!",
    "size": "EU 42 / M",
    "material": "cotton!!! top quality!!!",
    "price": "$24.99 incl. tax",
    "weight": "0.2 kg / lightweight"
  },
  "variants": [
    { "id": "Cotton T-Shirt Black M", "price": "$24.99" },
    { "id": "Cotton T-Shirt White L", "price": "$26.99" }
  ],
  "release_date": "14/02/26",
  "dimensions": "28x22x2cm approx."
}