Product names

The product name should be short, clear, and describe the product itself, not its variants. Do not include color, size, or other variable features in the name. Avoid repeated words, unnecessary descriptors, special characters, and separators. Each product should have one logical name, regardless of the number of variants.
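
The naming rules above can be checked automatically before indexing. The following is a minimal sketch, not an official validator: the `VARIANT_TERMS` vocabulary, the set of disallowed characters, and the function name `lint_product_name` are all assumptions chosen for illustration, and a real catalog would need its own lists.

```python
import re

# Hypothetical vocabulary of variant words; replace with your own colors and sizes.
VARIANT_TERMS = {"red", "blue", "black", "white", "small", "medium", "large", "xl", "xxl"}


def lint_product_name(name: str) -> list[str]:
    """Return a list of rule violations found in a product name."""
    issues = []
    words = re.findall(r"[a-z0-9]+", name.lower())
    if any(word in VARIANT_TERMS for word in words):
        issues.append("contains a variant term")
    if len(words) != len(set(words)):
        issues.append("contains repeated words")
    # Assumed set of unwanted characters and separators.
    if re.search(r"[|/\\*#!_]", name):
        issues.append("contains special characters or separators")
    return issues


print(lint_product_name("Cotton T-Shirt"))
print(lint_product_name("Cotton T-Shirt Red | XL"))
```

A word-list check like this is rough and will produce false positives (for example, a product that is legitimately called "Black Tea"), so it is better suited to flagging names for review than to rewriting them.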

Product descriptions

The description should be unique, detailed, and written in natural language. Use several full sentences that explain the product’s use, features, and context. Do not copy descriptions between products or rely only on keyword lists. The description is the best place for synonyms and language your customers use.

Product attributes

Attributes should come only from dedicated attribute fields, not from names or descriptions. Use consistent attribute names and value formats across your catalog. Do not include marketing text or long descriptions in attributes—use only specific, clear values.
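
One way to keep attribute names consistent is to map every known spelling to a single canonical name before indexing. The sketch below assumes a hand-maintained alias table; `ATTRIBUTE_ALIASES` and `canonical_attributes` are hypothetical names, not part of any search engine API.

```python
# Hypothetical alias table: every known spelling maps to one canonical attribute name.
ATTRIBUTE_ALIASES = {"colour": "color", "colors": "color", "sizes": "size"}


def canonical_attributes(raw: dict[str, str]) -> dict[str, str]:
    """Rename attributes to their canonical names and trim stray whitespace."""
    result = {}
    for name, value in raw.items():
        key = name.strip().lower()
        key = ATTRIBUTE_ALIASES.get(key, key)
        result[key] = value.strip()
    return result


print(canonical_attributes({" Colour ": " Red", "Size": "M "}))
```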

Product variants

Differences such as color, size, or other options should be handled as variants, not as separate products with different names. The product name should be the same for all variants. If variants are indexed separately, shared data (description, category, product type) must be identical.
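
The variant model described above can be sketched as one parent product that owns the shared data and a list of variants that carry only what differs. The field names (`sku`, `product_id`, and so on) are assumptions for illustration; your platform's schema may differ. When variants are indexed separately, copying shared fields from the parent guarantees they are identical.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    sku: str
    color: str
    size: str


@dataclass
class Product:
    product_id: str
    name: str
    description: str
    category: str
    variants: list[Variant] = field(default_factory=list)


def to_index_records(product: Product) -> list[dict]:
    """Emit one record per variant; shared fields are copied from the parent."""
    return [
        {
            "id": v.sku,
            "product_id": product.product_id,
            "name": product.name,
            "description": product.description,
            "category": product.category,
            "color": v.color,
            "size": v.size,
        }
        for v in product.variants
    ]


shirt = Product(
    product_id="P-100",
    name="Cotton T-Shirt",
    description="A soft everyday T-shirt made of combed cotton.",
    category="Clothing > T-Shirts",
    variants=[Variant("P-100-RED-M", "red", "M"), Variant("P-100-BLUE-L", "blue", "L")],
)
records = to_index_records(shirt)
```

Because the name is stored once on the parent, it cannot drift between variants.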

Categories

Assign each product a logical and consistent category. Categories should be hierarchical and used consistently throughout the store. Do not use categories as tags or to describe product features. Semantic duplicates and different names for the same category reduce search quality.
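
Different spellings of the same category can be found by comparing names after stripping case, spaces, and punctuation. This is a rough sketch under that assumption; `category_key` and `find_duplicate_categories` are hypothetical helpers, and the approach catches only spelling variants, not true synonyms such as "Sofas" and "Couches".

```python
import re
from collections import defaultdict


def category_key(name: str) -> str:
    """Reduce a category name to lowercase letters and digits only."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())


def find_duplicate_categories(names: list[str]) -> dict[str, list[str]]:
    """Group category names that collapse to the same key."""
    groups = defaultdict(set)
    for name in names:
        groups[category_key(name)].add(name)
    return {key: sorted(spellings) for key, spellings in groups.items() if len(spellings) > 1}


print(find_duplicate_categories(["T-Shirts", "t shirts", "Shoes", "Tshirts"]))
```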

Language consistency

Product data should use a single language within one index. Do not mix languages in names, descriptions, or attributes. Place synonyms and alternative names in descriptions or dedicated fields, not in the product name.

Data normalization

Before indexing, data should be cleaned and standardized. This includes letter case, extra spaces, typos, and value formatting. The same information must appear identically for all products so the search engine can interpret it correctly.
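
A cleaning step of this kind might look like the sketch below. It uses only the Python standard library; the `CANONICAL_VALUES` typo map and the function names are assumptions for illustration, and the map would need to be built from your own data.

```python
import re
import unicodedata


def normalize_text(value: str) -> str:
    """Apply Unicode normalization, trim, and collapse runs of whitespace."""
    value = unicodedata.normalize("NFKC", value)
    return re.sub(r"\s+", " ", value).strip()


# Hypothetical map of known typos and spelling variants to one canonical value.
CANONICAL_VALUES = {"grey": "gray", "blck": "black"}


def normalize_attribute_value(value: str) -> str:
    """Clean a value, lowercase it, and replace known typos."""
    cleaned = normalize_text(value).casefold()
    return CANONICAL_VALUES.get(cleaned, cleaned)


print(normalize_text("  Cotton   T-Shirt \n"))
print(normalize_attribute_value("  GREY "))
```

Lowercasing is applied here only to attribute values, where exact matching matters for filters; names and descriptions keep their original case.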

Uniqueness and completeness

Each product should have a unique identifier and a complete set of key data, such as name, description, and category. Empty fields and placeholder values (for example "N/A", "test", or technical defaults) lower search quality.
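
These checks are easy to run over a catalog export before indexing. The sketch below assumes each product is a dictionary with `id`, `name`, `description`, and `category` keys; the `PLACEHOLDERS` set and the function name `find_problems` are illustrative assumptions.

```python
REQUIRED_FIELDS = ("id", "name", "description", "category")
# Assumed list of values that count as placeholders; extend it for your data.
PLACEHOLDERS = {"", "n/a", "na", "none", "null", "tbd", "-", "test"}


def find_problems(products: list[dict]) -> list[str]:
    """Report missing or placeholder fields and duplicate identifiers."""
    problems = []
    seen_ids = set()
    for index, product in enumerate(products):
        for field_name in REQUIRED_FIELDS:
            value = str(product.get(field_name) or "").strip()
            if value.lower() in PLACEHOLDERS:
                problems.append(f"record {index}: missing or placeholder '{field_name}'")
        product_id = product.get("id")
        if product_id in seen_ids:
            problems.append(f"record {index}: duplicate id '{product_id}'")
        elif product_id:
            seen_ids.add(product_id)
    return problems


catalog = [
    {"id": "1", "name": "Desk Lamp", "description": "A compact LED lamp for desks.", "category": "Lighting"},
    {"id": "1", "name": "Floor Lamp", "description": "N/A", "category": "Lighting"},
    {"id": "2", "name": "", "description": "A tall lamp for living rooms.", "category": "Lighting"},
]
for problem in find_problems(catalog):
    print(problem)
```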

Numeric and structured data

Prices, weights, dimensions, and other numeric values should be stored in numeric fields. If a value must include additional text, such as a unit or index, use the same format for all products. Dates should use a single, consistent format.
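
As an illustration, source values that arrive as text can be converted once, before indexing. The sketch below makes two assumptions that may not hold for your data: prices use "," as a thousands separator and "." as the decimal point, and source dates are written as day.month.year. ISO 8601 (`YYYY-MM-DD`) is used as the target date format here only as one reasonable choice.

```python
import re
from datetime import datetime
from decimal import Decimal


def parse_price(raw: str) -> Decimal:
    """Extract a decimal number from text such as '1,299.00 USD'.

    Assumes ',' is a thousands separator and '.' is the decimal point.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        raise ValueError(f"no numeric value in {raw!r}")
    return Decimal(match.group().replace(",", ""))


def to_iso_date(raw: str, source_format: str = "%d.%m.%Y") -> str:
    """Convert a date string in the assumed source format to YYYY-MM-DD."""
    return datetime.strptime(raw, source_format).date().isoformat()


print(parse_price("1,299.00 USD"))
print(to_iso_date("05.03.2024"))
```

`Decimal` is used instead of `float` so that prices keep their exact decimal digits.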

Data approach

The search engine uses only the provided data and does not interpret context. The more organized, predictable, and semantically clean the product data, the better the search, ranking, and filtering results.

Summary

  • Product names should be short, clear, and not include variants (color, size, etc.).
  • Descriptions must be unique, several full sentences long, and written in natural language, not just keyword lists.
  • Keep attributes only in attribute fields, with one consistent name and value format.
  • Handle variants as variants, not as separate products with different names.
  • Categories should be logical, hierarchical, and used consistently throughout the store.
  • Use a single data language—do not mix languages in names, descriptions, or attributes.
  • Clean data: no typos, repetitions, unnecessary characters, or placeholders.
  • Each product must have all key information and a unique ID.
  • Store numeric values (prices, dimensions, weights) as numbers; if a value carries extra text such as a unit, use the same format for all products.
  • The simpler and more organized the data, the better the search and filtering results.