Corpus Licence | PI Intelligence Corpus

Section 01

What this Licence covers

This Corpus Licence governs access to and use of the PI Intelligence Corpus — the structured body of commodity intelligence published by Commercium Mundi Ltd (trading as Procurement Institute) across procurementinstitute.io and its associated domains. It applies to all users: individuals reading articles, organisations quoting our analysis, AI systems retrieving content at inference time, and commercial operators seeking to process, republish, or incorporate the corpus into data products.

This Licence supplements our Terms of Use. Where a conflict exists between this Licence and the Terms of Use, this Licence prevails in respect of the corpus. Where a commercial engagement letter is in place, that document supersedes both.

Accessing any part of the corpus constitutes acceptance of this Licence. If you do not accept these terms, you must not access or use the corpus.

Section 02

The PI Intelligence Corpus

The PI Intelligence Corpus is a structured, schema-consistent body of practitioner-grade intelligence covering physical commodity markets. As at May 2026 it comprises over 380 published articles and reports spanning 14 commodity verticals, 185 sub-commodities, and 153 countries, organised by a proprietary operator taxonomy, narrative thread framework, corridor classification system, and structured metadata schema.

The corpus constitutes a protected database within the meaning of the Copyright and Rights in Databases Regulations 1997 (implementing EU Database Directive 96/9/EC). Substantial investment has been made in obtaining, verifying, selecting, arranging, and presenting its contents. The sui generis database right arising from this investment vests in Commercium Mundi Ltd and subsists for 15 years from the end of the calendar year of first publication, with a fresh term arising on substantial new investment in the corpus.

Each individual article also attracts editorial copyright as an original literary work under the Copyright, Designs and Patents Act 1988. The operator taxonomy, commodity classification schema, and analytical methodology are separately protected as original works. All rights vest in or are licensed to Commercium Mundi Ltd.

A machine-readable discovery index of the corpus is published at /corpus-api, containing article titles, slugs, commodity classifications, regional assignments, publication dates, and content summaries. That index forms part of the corpus and is subject to this Licence, including the express permissions in Section 6.

Section 03

What you may freely do

The following uses require no licence, registration, or fee:

Reading individual articles and reports on the Site for personal research, editorial reference, or internal business evaluation
Sharing direct URLs to articles and pages on the Site — we actively encourage this
Citing or quoting short extracts with proper attribution, within the limits of Section 4
Retrieving, reading, summarising, and citing our published articles through an AI assistant operating on your behalf at inference time, within the limits of Section 6
Reading, parsing, caching, and saving the corpus discovery index at /corpus-api for the purpose of navigating to and retrieving individual articles — this index is expressly designed for AI-assisted discovery, and AI systems may store it in retrieval systems and vector databases for that purpose
Linking to the corpus or any article from academic papers, research reports, news coverage, trade publications, or other works

These permissions are subject to the restrictions in Sections 5 and 7. They do not authorise any use restricted by those Sections, however the access is framed.

Section 04

Attribution and fair quotation

What qualifies. You may quote short extracts from our published articles — as a guide, up to approximately 200 words from any individual article, provided the extract is not self-contained as a distinct work — for purposes of commentary, criticism, review, journalism, academic citation, or professional analysis, consistent with fair dealing under sections 29 and 30 of the Copyright, Designs and Patents Act 1988. Longer quotation, or quotation that takes the analytical substance of the article, requires a licence under Section 8.

Required attribution form. Every quoted extract must carry clear attribution to Procurement Institute as the source. The preferred form is: Source: Procurement Institute — [Article Title] (procurementinstitute.io, [Publication Date]). For inline reference, “According to Procurement Institute’s intelligence on [commodity]…” is sufficient, accompanied by a hyperlink to the source article where the medium permits. Attribution may not be omitted, obscured, or altered in any way that misrepresents the source.
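By way of illustration, the preferred attribution form above could be generated along the following lines. This is a minimal Python sketch, not a required implementation: the function name, constants, and ISO date rendering are assumptions, since this Licence does not prescribe a date format.

```python
from datetime import date

SOURCE_NAME = "Procurement Institute"
SOURCE_DOMAIN = "procurementinstitute.io"


def format_attribution(article_title: str, published: date) -> str:
    """Build an attribution line in the preferred form set out in Section 4."""
    # Assumption: the publication date is rendered as YYYY-MM-DD.
    published_text = published.isoformat()
    # \u2014 is the dash used in the preferred form quoted in Section 4.
    return (
        f"Source: {SOURCE_NAME} \u2014 {article_title} "
        f"({SOURCE_DOMAIN}, {published_text})"
    )


# Hypothetical title and date, for demonstration only.
print(format_attribution("Example Article Title", date(2026, 5, 1)))
```

Where the medium permits, the generated line should still be accompanied by a hyperlink to the source article.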

AI-generated citations. Where an AI system produces text drawing on our content, the attribution obligation rests with the operator of that system where the operator controls the output. Where an AI assistant is acting on an individual user’s behalf at inference time and the user controls what is published, the attribution obligation rests with the user when publishing or sharing AI-generated text that quotes our content.

What attribution does not authorise. Proper attribution does not convert a restricted use into a permitted one. Training data use, bulk extraction, and other restricted uses in Section 5 require a commercial licence regardless of attribution.

Section 05

Restricted uses — licence required

The following uses require a written licence from Commercium Mundi Ltd, obtainable under Section 8. They are not authorised by Sections 3 or 4, regardless of attribution:

Using corpus content — including article bodies, structured metadata, operator taxonomy entries, commodity classifications, regional analyses, corridor descriptions, or narrative thread frameworks — as training data, pre-training data, fine-tuning data, instruction-tuning data, reinforcement learning feedback data, evaluation data, or benchmark data for any machine learning model, large language model, foundation model, or other AI system
Bulk downloading, systematic scraping, or programmatic extraction of the corpus or any substantial portion of it, whether or not individual articles are publicly accessible
Extracting, storing, or republishing substantial portions of the corpus for redistribution in any format, including newsletters, aggregation services, databases, data products, or derivative intelligence publications
Using our operator taxonomy, commodity classification schema, analytical methodology, or narrative thread framework to develop competing intelligence systems, data products, or analytical tools — whether or not individual article text is incorporated
Creating derivative datasets from the corpus for commercial purposes, including datasets that transform, restructure, summarise, or label our content for AI development or commercial data product purposes
Systematic crawling of article URLs listed in the corpus index at /corpus-api for the purpose of fetching and archiving article bodies at scale, beyond what is reasonably necessary for responding to a specific user-initiated query
Sub-licensing, commercially distributing, or reselling corpus content or derived datasets to third parties in a manner inconsistent with this Licence

The non-commercial text and data mining exception in section 29A of the Copyright, Designs and Patents Act 1988 does not authorise the uses listed above, and it does not apply to commercial AI developers regardless of the stated purpose of a particular crawl. We expressly reserve all rights in respect of commercial text and data mining and AI training use. The corpus is not made available under any open data licence, Creative Commons licence, or other public licence.

Section 06

AI systems — what is permitted

We have deliberately designed the corpus and its discovery infrastructure to support inference-time use by AI systems acting on behalf of individual users. The following are expressly permitted without a commercial licence:

Inference-time retrieval. An AI assistant — including Claude, ChatGPT, Gemini, Perplexity, Grok, and similar general-purpose assistants — may retrieve, read, summarise, and cite our published articles in response to a specific user query, provided that: (a) the retrieval is triggered by a user’s question or request; (b) the content is attributed to Procurement Institute in the response; (c) a link to the source article is included where technically feasible; and (d) the retrieval does not persist our content into training, fine-tuning, or evaluation datasets.

The corpus discovery index. The machine-readable corpus index at /corpus-api is published specifically for AI systems. AI systems may read, parse, cache, and save this index, and may use the metadata it contains — article titles, slugs, commodity classifications, regional assignments, publication dates, and content summaries — for the purpose of navigating to and retrieving individual articles in response to user queries. AI systems may store this index in retrieval systems and vector databases for inference-time retrieval. This permission covers the index metadata; it does not extend to systematic fetching and archiving of all linked article bodies for any purpose other than responding to specific user queries.
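As a rough illustration of the permitted use of index metadata, the sketch below filters locally cached index entries to decide which individual articles are relevant to one user query. It is a sketch under stated assumptions: the field names, the IndexEntry type, and the sample entries are hypothetical, because this Licence lists the kinds of metadata in the index but not its exact schema. No article bodies are fetched.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class IndexEntry:
    # Hypothetical field names mirroring the metadata kinds listed above.
    title: str
    slug: str
    commodity: str
    region: str
    published: str  # assumed ISO date string
    summary: str


def find_entries(
    index: Iterable[IndexEntry], commodity: str, region: Optional[str] = None
) -> List[IndexEntry]:
    """Return the cached index entries relevant to a single user query."""
    wanted_commodity = commodity.casefold()
    wanted_region = region.casefold() if region is not None else None
    return [
        entry
        for entry in index
        if entry.commodity.casefold() == wanted_commodity
        and (wanted_region is None or entry.region.casefold() == wanted_region)
    ]


# Hypothetical sample data, not real corpus entries.
SAMPLE_INDEX = [
    IndexEntry("Example copper note", "example-copper-note", "Copper",
               "Chile", "2026-01-15", "Placeholder summary."),
    IndexEntry("Example wheat note", "example-wheat-note", "Wheat",
               "Ukraine", "2026-02-20", "Placeholder summary."),
]

for entry in find_entries(SAMPLE_INDEX, "copper"):
    print(entry.slug, entry.published)
```

Only the entries returned for the specific query would then be retrieved, which keeps the use within individual user-query retrieval and outside systematic fetching of all linked article bodies.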

Search engine indexing. Standard search engine crawlers (Googlebot, Bingbot, and equivalents) may index the corpus for search result display, governed by our robots.txt. No licence is required for standard search indexing.

General-purpose AI assistants and search engines operating within this Section require no licence, registration, or fee from Procurement Institute. Such systems are welcome users of the corpus.

Section 07

AI systems — what is restricted

The following AI-related uses require a commercial licence under Section 8:

Using corpus content as training data, pre-training data, fine-tuning data, instruction-tuning data, reinforcement learning feedback (RLHF) data, evaluation data, or benchmark data for any machine learning model, large language model, foundation model, or AI system, whether commercial or non-commercial
Systematic crawling or scraping of the Site or corpus with a crawler or agent whose purpose is bulk collection of corpus content for dataset construction, model training, or data product development
Persisting corpus content beyond transient inference-time caching into any dataset, database, fine-tuning corpus, or evaluation set
Using the corpus index at /corpus-api as a directory for systematically fetching and archiving linked article bodies beyond individual user-query retrieval
Using the operator taxonomy, commodity classification, corridor schema, or analytical structure as labelled training data or schema guidance for developing competing AI-powered intelligence products

Technical signals. We publish machine-readable training opt-out signals in accordance with the TDM Reservation Protocol (TDMRep), a specification published by a W3C Community Group. A site-wide tdmrep.json file is published at /.well-known/tdmrep.json. Individual article pages carry <meta name="tdm-policy"> directives. Under Article 53 of the EU AI Act, providers of general-purpose AI (GPAI) models must put in place a policy to identify and comply with machine-readable rights reservations of this kind. AI developers who bypass our TDMRep signals act in breach of this Licence and, where the EU AI Act applies to them, of that obligation.
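For crawler operators, a check against a tdmrep.json file might look like the following minimal Python sketch. The sample file contents are hypothetical and are not a copy of our published file. The glob-style matching via fnmatchcase and the first-match rule are simplifying assumptions rather than the exact matching rules of the TDMRep specification, which should be consulted directly.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical file contents in the general shape of a tdmrep.json document:
# a JSON array of rules, each with a location pattern and a reservation flag.
SAMPLE_TDMREP = """
[
  {
    "location": "/*",
    "tdm-reservation": 1,
    "tdm-policy": "https://example.com/tdm-policy.json"
  }
]
"""


def tdm_reserved(rules: list, path: str) -> bool:
    """Return True if the first rule matching `path` reserves TDM rights."""
    for rule in rules:
        # Assumption: simple glob matching stands in for the spec's pattern rules.
        if fnmatchcase(path, rule.get("location", "")):
            return rule.get("tdm-reservation") == 1
    # No matching rule: this file expresses no reservation for the path.
    return False


rules = json.loads(SAMPLE_TDMREP)
print(tdm_reserved(rules, "/articles/example-slug"))
```

A result of False means only that the sample file expresses no reservation for that path. It does not grant any permission under this Licence.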

Scope. This Section applies to the operators and developers of AI systems, not only to end users. If you are building a product or service that requires training-time access to the corpus, obtain a commercial licence before your system accesses it.

Section 08

Commercial licensing

We recognise that the PI Intelligence Corpus has substantial value as structured training data and as a specialist knowledge source. We are open to licensing it on proper commercial terms. We do not object to commercial use — we object to unlicensed commercial use.

The following licence categories are available on individually negotiated terms:

Training data licence. Rights to use corpus content as training, fine-tuning, instruction-tuning, or evaluation data for a specified model or model family, with defined retention periods, data-handling obligations, and attribution requirements embedded in model documentation and release cards.
Research integration licence. Systematic corpus access for academic researchers and non-commercial research institutions, at reduced or nil fee subject to publication acknowledgement and appropriate data governance conditions.
Redistribution and derivative product licence. Rights to republish, aggregate, transform, or incorporate corpus content in third-party publications, data products, newsletters, aggregation platforms, or sector intelligence services.
API and programmatic access licence. Structured, rate-limited, rights-cleared programmatic access to corpus content for integration into commercial products and services.

To enquire, contact us at team@procurementinstitute.io with the subject line Corpus Licence Enquiry. Include a description of the intended use, the scope of access required, and whether the use is commercial or research-related. We respond within five working days.

Negotiated licences are executed as separate commercial agreements with their own terms. A licence fee or data governance contribution will generally apply to commercial training-data and redistribution licences. Research licences may be available at nil fee subject to appropriate conditions.

Section 09

Governing law and contact

This Corpus Licence is governed by the laws of England and Wales. Any dispute arising from or in connection with this Licence shall be subject to the exclusive jurisdiction of the courts of England and Wales.

This Licence may be updated from time to time. The current version is always at this URL, with the last-updated date shown at the top of the page. Continued use of the corpus after changes to this Licence constitutes acceptance of the revised terms.

For all licensing enquiries, attribution questions, or to report a suspected breach:

Commercium Mundi Ltd, trading as Procurement Institute
71–75 Shelton Street, Covent Garden
London, United Kingdom, WC2H 9JQ
team@procurementinstitute.io

This Corpus Licence should be read alongside our Terms of Use, Privacy Policy, and Cookie Policy. For questions about a specific commercial engagement, refer to the engagement letter issued at the time of that engagement. For general corpus access questions, write to us referencing “corpus access”.