Conversation

@jimstir
Collaborator

@jimstir jimstir commented Sep 26, 2025

RFC for Codex manifest....

@jimstir jimstir marked this pull request as draft September 26, 2025 13:37
@jimstir jimstir marked this pull request as ready for review October 16, 2025 21:01
@jimstir jimstir requested a review from emizzle October 24, 2025 15:10

@emizzle emizzle left a comment

I'd rather see the Rationale section of the original spec in the Background, since it is more cohesive and in depth.

Also, there are some functional and non-functional requirements from the original spec that could be translated into specs here.

Comment on lines +25 to +28
In version 1 of the BitTorrent protocol a user wants to upload (seed) some content to the BitTorrent network,
the client chunks the content into pieces.
For each piece, a hash is computed and
included in the pieces attribute of the info dictionary in the BitTorrent metainfo file.

Seems a bit irrelevant imo.

Collaborator Author

This can be removed, but I think it was a good intro to how the BitTorrent metainfo file is constructed compared to the Codex manifest. It also helps introduce the reader to the concepts.

Comment on lines 18 to 39
The Codex manifest provides the description of the metadata uploaded to the Codex network.
It is in many ways similar to the BitTorrent metainfo file also known as .torrent files,
(for more information see, [BEP3](http://bittorrent.org/beps/bep_0003.html) from BitTorrent Enhancement Proposals (BEPs).
While the BitTorrent metainfo files are generally distributed out-of-band,
Codex manifest receives its own content identifier, [CIDv1](https://github.com/multiformats/cid#cidv1), that is announced on the Codex DHT, also
see the [CODEX-DHT specification](./dht.md).

In version 1 of the BitTorrent protocol a user wants to upload (seed) some content to the BitTorrent network,
the client chunks the content into pieces.
For each piece, a hash is computed and
included in the pieces attribute of the info dictionary in the BitTorrent metainfo file.
In Codex,
instead of hashes of individual pieces,
we create a Merkle Tree computed over the blocks in the dataset.
We then include the CID of the root of this Merkle Tree as treeCid attribute in the Codex Manifest file.

In version 2 of the BitTorrent protocol also uses Merkle Trees and
includes the root of the tree in the info dictionary for each .torrent file.

The Codex manifest, CID in particular,
is the ability to uniquely identify the content and
be able to retrieve that content from any single Codex client.
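
Editorial sketch (not part of the PR diff): the contrast the quoted lines draw between per-piece hashes and a single Merkle root can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the Codex implementation. BitTorrent v1 does use SHA-1 per piece (BEP 3), but the SHA-256 leaf hash, the promotion of an unpaired node to the next level, and the function names `piece_hashes` and `merkle_root` are all hypothetical choices made for illustration. The actual Codex Tree construction and the CID encoding of its root are defined by the Codex specs and are not reproduced here.

```python
import hashlib


def piece_hashes(data: bytes, piece_size: int) -> list[bytes]:
    """BitTorrent v1 style: one SHA-1 digest per fixed-size piece."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def merkle_root(data: bytes, block_size: int) -> bytes:
    """Single root over all blocks (illustrative only).

    Assumptions: SHA-256 for leaves and inner nodes; an unpaired node
    is promoted unchanged to the next level. Codex may differ on both.
    """
    level = [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]
    if not level:
        # Empty dataset: hash of the empty byte string (assumption).
        level = [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(hashlib.sha256(level[i] + level[i + 1]).digest())
            else:
                nxt.append(level[i])  # promote the unpaired node
        level = nxt
    return level[0]
```

The per-piece list grows with the size of the content, while the root stays a single fixed-size digest, which is why only one value (wrapped as a CID in Codex) needs to appear in the manifest.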

Why not use the Rationale section from the component spec instead? These paragraphs lack depth that is presented in the component spec.

Collaborator Author

Yes, this info came from the Rationale section; I just titled it Background. Renaming the section to Rationale is probably better.


| attribute | type | description |
|-----------|------|-------------|
| `treeCid` | string | A hash based on [CIDv1](https://github.com/multiformats/cid#cidv1) of the root of the [Codex Tree], which is a form of a Merkle Tree corresponding to the dataset described by the manifest. Its multicodec is (codex-root, 0xCD03) |

The protobuf shows this as bytes, and so do the Codex specs.

actually most of the fields have wrong types here

@@ -0,0 +1,94 @@
---
title: CODEX-MANIFEST
name: Codes Manifest

typo? Codes

@fbarbu15

fbarbu15 commented Nov 5, 2025

@jimstir can you please check the comments and fix linting issues?

@jimstir
Collaborator Author

jimstir commented Nov 5, 2025

@emizzle I don't think the other functional requirements you are referring to, such as how a file would be downloaded from the network using the manifest, should be added to this spec.

@jimstir jimstir requested review from emizzle and fbarbu15 November 5, 2025 15:15

@fbarbu15 fbarbu15 left a comment

LGTM thanks

@jimstir jimstir requested a review from fbarbu15 November 14, 2025 15:45
@fbarbu15 fbarbu15 requested a review from Cofson November 18, 2025 14:57