The most versatile form of IPFS Gateway is a Path Gateway.
It exposes namespaces like /ipfs/ and /ipns/ under the HTTP server root and provides basic primitives for integrating IPFS resources within an existing HTTP stack.
Note: additional Web Gateways aimed for website hosting and web browsers extend the below spec and are defined in [subdomain-gateway] and [dnslink-gateway]. There is also a minimal [trustless-gateway] specification for use cases where client prefers to perform all validation locally.
Path Gateway provides HTTP interface for requesting content-addressed data at specified content path.
GET /ipfs/{cid}[/{path}][?{params}]
Downloads data at specified immutable content path.
cid – a valid content identifier (CID)
path – optional path parameter pointing at a file or a directory under the cid content root
params – optional query parameters that adjust response behavior
HEAD /ipfs/{cid}[/{path}][?{params}]
Same as GET, but does not return any payload.
Implementations SHOULD limit the scope of IPFS data transfer triggered by HEAD requests to a minimal DAG subset required for producing response headers such as X-Ipfs-Roots, Content-Length and Content-Type.
An HTTP client can send a HEAD request with Cache-Control: only-if-cached to disable IPFS data transfer and inexpensively probe if the gateway has the data cached.
Implementations MUST ensure that handling an only-if-cached HEAD request is fast and does not generate any additional I/O such as IPFS data transfer. This allows light clients to probe and prioritize gateways which already have the data.
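The probe described above can be sketched from the client side as follows. This is a minimal illustration, not a reference client: the gateway URL and CID are placeholders, and the helper names build_cache_probe and is_cached are hypothetical. It assumes a 200 answer means the data is cached and that 412 means it is not.

```python
import urllib.request


def build_cache_probe(gateway: str, content_path: str) -> urllib.request.Request:
    """Build a HEAD request that asks the gateway to answer only from its cache."""
    return urllib.request.Request(
        gateway.rstrip("/") + content_path,
        method="HEAD",
        headers={"Cache-Control": "only-if-cached"},
    )


def is_cached(status: int) -> bool:
    """Interpret the probe status: 200 is treated as cached, anything else as not."""
    return status == 200


# Placeholder gateway and CID, used only to show the shape of the request.
probe = build_cache_probe("https://gateway.example", "/ipfs/bafy-example-cid")
```

A client would send the probe to several gateways (for example with urllib.request.urlopen, catching urllib.error.HTTPError for the 412 case) and prefer the ones for which is_cached returns True.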
GET /ipns/{name}[/{path}][?{params}]
Downloads data at specified mutable content path.
Implementation must resolve the name to a CID, then serve response behind a /ipfs/{resolved-cid}[/{path}][?{params}] content path.
name
may refer to:
HEAD /ipns/{name}[/{path}][?{params}]
Same as GET, but does not return any payload.
All request headers are optional.
If-None-Match (request header)
Used for HTTP caching.
Enables advanced cache control based on Etag, allowing client and server to skip data transfer if previously downloaded payload did not change.
The Gateway MUST compare Etag values sent in If-None-Match with the Etag that would be sent with the response. A positive match MUST return HTTP status code 304 (Not Modified), without any payload.
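A minimal sketch of this comparison is shown below. The function names are hypothetical. It uses the weak comparison that RFC 9110 prescribes for If-None-Match (the W/ prefix is ignored), and it splits the header on commas, which is a simplification that assumes entity tags do not themselves contain commas.

```python
def strip_weak(etag: str) -> str:
    """Drop the W/ prefix so that weak comparison ignores it."""
    return etag[2:] if etag.startswith("W/") else etag


def if_none_match_hits(if_none_match: str, current_etag: str) -> bool:
    """Return True when the request header matches the Etag of the response."""
    if if_none_match.strip() == "*":
        return True
    candidates = [part.strip() for part in if_none_match.split(",")]
    return strip_weak(current_etag) in {strip_weak(c) for c in candidates if c}


def status_for(if_none_match, current_etag: str) -> int:
    """Pick 304 on a positive match, otherwise continue with a normal 200."""
    if if_none_match is not None and if_none_match_hits(if_none_match, current_etag):
        return 304
    return 200
```

For example, status_for('"bafy-a"', '"bafy-a"') selects 304, while a request without the header always continues as a normal response.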
Cache-Control (request header)
Used for HTTP caching.
only-if-cached
Client can send Cache-Control: only-if-cached to request data only if the gateway already has the data (e.g. in local datastore) and can return it immediately.
If data is not cached locally, and the response requires an expensive remote fetch, a 412 Precondition Failed HTTP status code should be returned by the gateway without any payload or specific HTTP headers.
NOTE: when processing a request for a DAG, traversing it and checking every CID might be too expensive. Implementations SHOULD implement their own heuristics to maximize cache hits while minimizing the performance cost of checking if the entire DAG is locally cached. A good rule of thumb is to test, at a minimum, if the root block is in the local cache.
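The rule of thumb above (test at least the root block) could look like the sketch below on the gateway side. The function name and the local_blocks set are assumptions standing in for whatever datastore lookup an implementation really has.

```python
def must_reject_only_if_cached(cache_control, root_cid: str, local_blocks: set) -> bool:
    """Return True when the request should be answered with 412.

    Only the root block is checked, which is the cheap heuristic: the rest
    of the DAG is not traversed.
    """
    directives = {d.strip().lower() for d in (cache_control or "").split(",")}
    return "only-if-cached" in directives and root_cid not in local_blocks


# Hypothetical local cache containing a single root block.
local = {"cid-a"}
```

When the function returns True the gateway answers 412 without payload; otherwise it proceeds with normal request handling.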
Accept (request header)
Can be used for requesting a specific response format.
For example:
application/vnd.ipld.dag-json: if the requested CID already has the dag-json (0x0129) codec, data is validated as DAG-JSON before being returned as-is. Invalid DAG-JSON produces HTTP Error 500.
application/vnd.ipld.dag-cbor: if the requested CID already has the dag-cbor (0x71) codec, data is validated as DAG-CBOR before being returned as-is. Invalid DAG-CBOR produces HTTP Error 500.
application/json: same as application/vnd.ipld.dag-json, unless the CID's codec already is json (0x0200). Then, the raw JSON block can be returned as-is without any conversion.
application/cbor: same as application/vnd.ipld.dag-cbor, unless the CID's codec already is cbor (0x51). Then, the raw CBOR block can be returned as-is without any conversion.
Range (request header)
Range can be used for requesting a specific byte range of UnixFS files and raw blocks.
Gateway implementations SHOULD be smart enough to require only the minimal DAG subset necessary for handling the range request.
Gateways SHOULD support single range requests. The support of more than one range is optional: implementation MAY decide to not support more than one range.
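As an illustration of single-range support, the sketch below parses one bytes range into an inclusive pair of offsets and gives up on anything else. The function name is hypothetical, and what to do when it returns None (serve the full response or answer with an error) is left to the implementation.

```python
from typing import Optional, Tuple


def parse_single_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse 'bytes=start-end' into an inclusive (start, end) pair.

    Returns None for multi-range, malformed, or unsatisfiable values.
    """
    if size <= 0:
        return None
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        return None
    first, dash, last = spec.strip().partition("-")
    if not dash:
        return None
    try:
        if first == "":  # suffix form: the last N bytes
            length = int(last)
            if length <= 0:
                return None
            return max(size - length, 0), size - 1
        start = int(first)
        end = int(last) if last else size - 1
    except ValueError:
        return None
    if start >= size or end < start:
        return None
    return start, min(end, size - 1)
```

Knowing the offsets up front is what lets a gateway fetch only the DAG subset that covers the requested bytes.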
Service-Worker (request header)
Mentioned here for security reasons and should be implemented with care.
This header is sent by a web browser attempting to register a service worker script for a specific scope. Allowing too broad a scope can allow a single content root to take control over the gateway endpoint. It is important for implementations to handle this correctly.
Service worker registration should only be allowed under specific content roots under /ipfs/{cid}/ and /ipns/{name}/ (IMPORTANT: note the trailing slash).
Gateway should refuse attempts to register a service worker for entire /ipfs/cid or /ipns/name (IMPORTANT: when trailing slash is missing).
Requests to these paths with Service-Worker: script MUST be denied by returning HTTP 400 Bad Request error.
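One way to express this check is sketched below. The function name and the plain dict of headers (with the exact key Service-Worker) are assumptions made for brevity; a real server would use its framework's case-insensitive header lookup.

```python
import re

# Matches /ipfs/{cid} and /ipns/{name} with no trailing slash and no subpath.
NAMESPACE_ROOT = re.compile(r"^/ip[fn]s/[^/]+$")


def deny_service_worker(path: str, headers: dict) -> bool:
    """Return True when the request must be answered with 400 Bad Request."""
    wants_registration = headers.get("Service-Worker", "").strip().lower() == "script"
    return wants_registration and NAMESPACE_ROOT.match(path) is not None
```

Requests for deeper paths, or requests without the header, are not affected by this check.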
All query parameters are optional.
filename (request query parameter)
Optional, can be used for overriding the filename.
When set, gateway will include it in the Content-Disposition header and may use it for Content-Type calculation.
Example:
https://ipfs.io/ipfs/QmfM2r8seH2GiRaC4esTjeraXEachRt8ZsSeGaWTPLyMoG?filename=hello_world.txt
download (request query parameter)
Optional, can be used to request specific Content-Disposition to be set on the response.
Response to HTTP request with download=true MUST include Content-Disposition: attachment[;filename=...] to indicate that client should not render the response.
The attachment context will force user agents such as web browsers to present a 'Save as' dialog instead (prefilled with the value of the filename parameter, if present).
format (request query parameter)
Optional, format=<format> can be used to request specific response format.
This is a URL-friendly alternative to sending an Accept header.
These are the equivalents:
format=raw → Accept: application/vnd.ipld.raw
format=car → Accept: application/vnd.ipld.car
format=tar → Accept: application/x-tar
format=dag-json → Accept: application/vnd.ipld.dag-json
format=dag-cbor → Accept: application/vnd.ipld.dag-cbor
format=json → Accept: application/json
format=cbor → Accept: application/cbor
format=ipns-record → Accept: application/vnd.ipfs.ipns-record
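The equivalence list above maps naturally onto a lookup table. The sketch below only translates the format parameter into its Accept equivalent; the function name is hypothetical, and how a gateway reconciles a format parameter with an Accept header sent in the same request is not covered here.

```python
from urllib.parse import parse_qs, urlsplit

# Mapping taken from the equivalence list above.
FORMAT_TO_ACCEPT = {
    "raw": "application/vnd.ipld.raw",
    "car": "application/vnd.ipld.car",
    "tar": "application/x-tar",
    "dag-json": "application/vnd.ipld.dag-json",
    "dag-cbor": "application/vnd.ipld.dag-cbor",
    "json": "application/json",
    "cbor": "application/cbor",
    "ipns-record": "application/vnd.ipfs.ipns-record",
}


def accept_for_format(url: str):
    """Return the Accept equivalent of ?format=, or None if absent or unknown."""
    params = parse_qs(urlsplit(url).query)
    fmt = params.get("format", [None])[0]
    return FORMAT_TO_ACCEPT.get(fmt)
```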
dag-scope (request query parameter)
Only used on CAR requests, same as dag-scope from [trustless-gateway].
entity-bytes (request query parameter)
Only used on CAR requests, same as entity-bytes from [trustless-gateway].
200 OK
The request succeeded.
If the HTTP method was GET, then data is transmitted in the message body.
206 Partial Content
Range request succeeded.
Returned with the range of data described by the Range header of the request.
301 Moved Permanently
Indicates permanent redirection.
The new, canonical URL is returned in the Location header.
400 Bad Request
A generic client error returned when it is not possible to return a better one.
404 Not Found
Error to indicate that request was formally correct, but traversal of the requested content path was not possible due to an invalid or missing DAG node.
410 Gone
Error to indicate that request was formally correct, but this specific Gateway refuses to return requested data.
Particularly useful for implementing deny lists, in order to not serve malicious content. The name of deny list and unique identifier of blocked entries can be provided in the response body.
See: Denylists
412 Precondition Failed
Error to indicate that request was formally correct, but Gateway is unable to return requested data under the additional (usually cache-related) conditions sent by the client.
Use with Cache-Control: only-if-cached: when the gateway cannot satisfy the only-if-cached condition, it returns 412 to the client.
The only-if-cached condition is handled by the gateway itself, moving the error to client error range and avoiding confusing server errors in places like the browser console.
429 Too Many Requests
Error to indicate the client has sent too many requests in a given amount of time.
This error response SHOULD include Retry-After HTTP header to indicate how long the client should wait before making a follow-up request.
500 Internal Server Error
A generic server error returned when it is not possible to return a better one.
502 Bad Gateway
Returned immediately when Gateway was not able to produce response for a known reason. For example, when gateway failed to find any providers for requested data.
This error response SHOULD include Retry-After HTTP header to indicate how long the client should wait before retrying.
504 Gateway Timeout
Returned when Gateway was not able to produce response under set time limits. For example, when gateway failed to retrieve data from a remote provider.
There is no generic timeout; Gateway implementations SHOULD set timeouts based on specific use cases.
This error response SHOULD include Retry-After HTTP header to indicate how long the client should wait before retrying.
Etag (response header)
Used for HTTP caching.
An opaque identifier for a specific version of the returned payload. The unique value must be wrapped by double quotes as noted in Section 8.8.3 of [rfc9110].
In many cases it is not enough to base Etag value on requested CID.
To ensure Etag is unique enough to avoid issues with caching reverse proxies and CDNs, implementations should base it on both CID and response type:
By default, etag should be based on requested CID. Example: Etag: "bafy…foo"
If a custom format was requested (such as a raw block, CAR), the returned etag should be modified to include it. It could be a suffix. Example: Etag: "bafy…foo.raw"
If HTML directory index was generated by the gateway, the etag returned with HTTP response should be based on the version of gateway implementation. This is to ensure proper cache busting if code responsible for HTML generation changes in the future. Example: Etag: "DirIndex-2B423AF_CID-bafy…foo"
When a gateway can't guarantee byte-for-byte identical responses, a "weak" etag should be used. Examples: Etag: W/"bafy…foo.car" and Etag: W/"bafy…foo.x-tar".
When responding to a Range request, a strong Etag should be based on requested range in addition to CID and response format:
Etag: "bafy..foo.0-42"
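The Etag rules above can be sketched as two small helpers. The names, the dot-separated suffix layout, and the placeholder CID are assumptions modeled on the examples in this section, not a required format.

```python
def make_etag(cid: str, fmt=None, byte_range=None, weak: bool = False) -> str:
    """Compose an Etag from the CID, an optional format suffix and byte range."""
    value = cid
    if fmt:
        value += "." + fmt
    if byte_range:
        value += ".%d-%d" % byte_range
    return ("W/" if weak else "") + '"' + value + '"'


def make_dir_index_etag(cid: str, gateway_version: str) -> str:
    """Etag for a generated HTML directory index, tied to the gateway version."""
    return '"DirIndex-%s_CID-%s"' % (gateway_version, cid)
```

For example, make_etag("bafy-example", fmt="car", weak=True) yields W/"bafy-example.car", and make_etag("bafy-example", byte_range=(0, 42)) yields "bafy-example.0-42".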
Cache-Control (response header)
Used for HTTP caching.
An explicit caching directive for the returned response. Informs HTTP client and intermediate middleware caches such as CDNs if the response can be stored in caches.
Returned directive depends on requested content path and format:
Cache-Control: public, max-age=29030400, immutable MUST be returned for every immutable resource under /ipfs/ namespace.
Cache-Control: public, max-age=<ttl> SHOULD be returned for mutable resources under /ipns/{id-with-ttl}/ namespace; max-age=<ttl> SHOULD indicate remaining TTL of the mutable pointer such as [ipns-record] or DNSLink TXT record.
Cache-Control
Last-Modified
header with the timestamp of the record resolution.
Last-Modified (response header)
Optional, used as additional hint for HTTP caching.
Returning this header depends on the information available:
The header can be returned with /ipns/
responses when the gateway
implementation knows the exact time a mutable pointer was updated by the
publisher.
When only TTL is known, Cache-Control
should be used instead.
Legacy implementations set this header to the current timestamp when reading
TTL on /ipns/
content paths was not available. This hint was used by web
browsers in a process called "Calculating Heuristic Freshness"
(Section 4.2.2 of [rfc9111]). Each browser
uses different heuristic, making this an inferior, non-deterministic caching
strategy.
New implementations should not return this header if TTL is not known;
providing a static expiration window in Cache-Control
is easier to reason
about than cache expiration based on the fuzzy “heuristic freshness”.
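A minimal sketch of choosing the Cache-Control value by namespace follows. The function name is hypothetical, and returning None when the TTL is unknown is an assumption that leaves the choice of a static expiration window to the caller.

```python
from typing import Optional

IMMUTABLE = "public, max-age=29030400, immutable"


def cache_control_for(content_path: str, remaining_ttl: Optional[int] = None) -> Optional[str]:
    """Pick a Cache-Control value from the namespace and the remaining TTL."""
    if content_path.startswith("/ipfs/"):
        return IMMUTABLE
    if content_path.startswith("/ipns/") and remaining_ttl is not None:
        return "public, max-age=%d" % max(remaining_ttl, 0)
    return None  # TTL unknown: the caller decides on a static window
```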
Content-Type (response header)
Returned with custom response formats such as application/vnd.ipld.car or application/vnd.ipld.raw. CAR must be returned with explicit version.
Example: Content-Type: application/vnd.ipld.car; version=1
When deserialized responses are enabled, and no explicit response format is provided with the request, and the requested data itself has no built-in content type metadata, implementations SHOULD perform content type sniffing based on file name (from URL path, or optional filename parameter) and magic bytes to improve the utility of produced responses.
For example:
Content-Type: text/plain instead of application/octet-stream
Content-Type: image/svg+xml instead of text/xml
Content-Disposition (response header)
Returned when download, filename query parameter, or a custom response format such as car or raw block are used.
The first parameter passed in this header indicates if content should be displayed inline by the browser, or sent as an attachment that opens the "Save As" dialog:
Content-Disposition: inline is the default, returned when request was made with download=false or a custom filename was provided with the request without any explicit download parameter.
Content-Disposition: attachment is returned only when request was made with the explicit download=true
The remainder is an optional filename parameter that will be prefilled in the "Save As" dialog.
NOTE: when the filename includes non-ASCII characters, the header must include both ASCII and UTF-8 representations for compatibility with legacy user agents and existing web browsers.
To illustrate, ?filename=testтест.pdf should produce:
Content-Disposition: inline; filename="test____.pdf"; filename*=UTF-8''test%D1%82%D0%B5%D1%81%D1%82.pdf
In the ASCII representation, non-ASCII characters are replaced with _.
UTF-8'' is not a typo – see Section 3.2.3 of [rfc8187].
Content-Disposition must also be set when a binary response format was requested:
Content-Disposition: attachment; filename="<cid>.car"
should be returned
with Content-Type: application/vnd.ipld.car
responses to ensure client does
not attempt to render streamed bytes. CID and .car
file extension should be
used if a custom filename
was not provided with the request.
Content-Disposition: attachment; filename="<cid>.bin"
should be returned
with Content-Type: application/vnd.ipld.raw
responses to ensure client does
not attempt to render raw bytes. CID and .bin
file extension should be used
if a custom filename
was not provided with the request.
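The inline/attachment choice and the dual ASCII and UTF-8 filename encoding can be sketched as below. The function name is hypothetical. Replacing quotes and backslashes in the ASCII fallback is an extra precaution assumed here, not something this section requires.

```python
from urllib.parse import quote


def content_disposition(filename: str, download: bool = False) -> str:
    """Build a Content-Disposition value with an ASCII fallback filename."""
    disposition = "attachment" if download else "inline"
    # Non-ASCII characters become "_" in the legacy ASCII representation.
    ascii_name = "".join(ch if ord(ch) < 128 else "_" for ch in filename)
    ascii_name = ascii_name.replace("\\", "_").replace('"', "_")
    value = '%s; filename="%s"' % (disposition, ascii_name)
    if ascii_name != filename:
        # Percent-encoded UTF-8 representation for modern user agents.
        value += "; filename*=UTF-8''" + quote(filename, safe="")
    return value
```

A plain ASCII name such as hello_world.txt produces only the filename parameter, while a name with non-ASCII characters produces both parameters.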
Content-Length (response header)
Represents the length of returned HTTP payload.
NOTE: the value may differ from the real size of requested data if compression or chunked Transfer-Encoding are used.
Content-Range (response header)
Returned only when request was a Range request.
See Section 14.4 of [rfc9110].
Accept-Ranges (response header)
Optional, returned to explicitly indicate if gateway supports partial HTTP Range requests for a specific resource.
For example, Accept-Ranges: none should be returned with application/vnd.ipld.car responses if the block order in CAR stream is not deterministic.
Location (response header)
Returned only when response status code is 301 Moved Permanently.
The value informs the HTTP client about new URL for requested resource.
This header is more widely used in SUBDOMAIN_GATEWAY.md.
Gateway MUST return a redirect when a valid UnixFS directory was requested without the trailing /, for example:
https://ipfs.io/ipns/en.wikipedia-on-ipfs.org/wiki (no trailing slash) will be HTTP 301 redirect with Location: /ipns/en.wikipedia-on-ipfs.org/wiki/
X-Ipfs-Path (response header)
Used for HTTP caching and indicating the IPFS address of the data.
Indicates the original, requested content path before any path resolution and traversal is performed.
Example: X-Ipfs-Path: /ipns/k2..ul6/subdir/file.txt
X-Ipfs-Roots (response header)
Used for HTTP caching.
A way to indicate all CIDs required for resolving logical roots (path segments) from X-Ipfs-Path. The main purpose of this header is allowing HTTP caches to make smarter decisions about cache invalidation.
Below, an example to illustrate how X-Ipfs-Roots is constructed from X-Ipfs-Path pointing at a DNSLink.
The traversal of /ipns/en.wikipedia-on-ipfs.org/wiki/Block_of_Wikipedia_in_Turkey includes a HAMT-sharded UnixFS directory /wiki/.
This header only cares about logical roots (one per URL path segment):
/ipns/en.wikipedia-on-ipfs.org
→ bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze
/ipns/en.wikipedia-on-ipfs.org/wiki/
→ bafybeihn2f7lhumh4grizksi2fl233cyszqadkn424ptjajfenykpsaiw4
/ipns/en.wikipedia-on-ipfs.org/wiki/Block_of_Wikipedia_in_Turkey
→ bafkreibn6euazfvoghepcm4efzqx5l3hieof2frhp254hio5y7n3hv5rma
Final array of roots:
X-Ipfs-Roots: bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze,bafybeihn2f7lhumh4grizksi2fl233cyszqadkn424ptjajfenykpsaiw4,bafkreibn6euazfvoghepcm4efzqx5l3hieof2frhp254hio5y7n3hv5rma
NOTE: while the first CID will change every time any article is changed, the last root (responsible for specific article or a subdirectory) may not change at all, allowing for smarter caching beyond what standard Etag offers.
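To make the caching idea concrete, the sketch below joins one CID per logical path segment into the header value and compares the last root of two responses. The function names and the short placeholder CIDs are hypothetical.

```python
def x_ipfs_roots(resolved_roots) -> str:
    """Join one CID per logical path segment, in traversal order."""
    return ",".join(resolved_roots)


def last_root_unchanged(old_header: str, new_header: str) -> bool:
    """True when the last root (the requested resource itself) did not change."""
    return old_header.split(",")[-1] == new_header.split(",")[-1]


# Placeholder CIDs: the site root changed, the article did not.
before = x_ipfs_roots(["cid-site-v1", "cid-wiki-v1", "cid-article"])
after = x_ipfs_roots(["cid-site-v2", "cid-wiki-v2", "cid-article"])
```

A cache that sees last_root_unchanged(before, after) return True could keep serving its stored copy of the article even though the site root moved.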
X-Content-Type-Options (response header)
Optional, present in certain response types:
X-Content-Type-Options: nosniff should be returned with application/vnd.ipld.car and application/vnd.ipld.raw responses to indicate that the Content-Type should be followed and not be changed. This is a security feature that ensures non-executable binary response types are not used in <script> and <style> HTML tags.
Retry-After (response header)
Gateway returns this header with error responses such as 429 Too Many Requests or 504 Gateway Timeout.
The "Retry-After" header indicates how long the user agent ought to wait before making a follow-up request.
See Section 10.2.3 of [rfc9110].
Server-Timing (response header)
Optional. Implementations MAY use this header to communicate one or more metrics and descriptions for the given request-response cycle.
See Server-Timing at W3C: Server Timing.
Traceparent (response header)
Optional. Implementations MAY use this header to return a globally unique identifier to help in debugging errors and performance issues.
See Traceparent at W3C: Trace Context.
Tracestate (response header)
Optional. Implementations MAY use this header to return additional vendor-specific trace identification information across different distributed tracing systems. It is a companion header for the Traceparent header.
See Tracestate at W3C: Trace Context.
Data sent with HTTP response depends on the type of the requested IPFS resource, and the requested response type.
By default, implicit deserialized response type is based on Accept header and the codec of the resolved CID:
UnixFS, dag-pb (0x70) or raw (0x55):
File or raw block: when Range is present, only the requested byte range is returned.
Directory: when index.html is present, gateway MUST skip generating directory index and return content from index.html instead.
JSON (0x0200): handled the same as raw, but returned Content-Type is application/json
CBOR (0x51): handled the same as raw, but returned Content-Type is application/cbor
DAG-JSON (0x0129): if the Accept header includes text/html, implementation should return a generated HTML with options to download DAG-JSON as-is, or converted to DAG-CBOR. Otherwise, handled the same as a raw block, but returned Content-Type is application/vnd.ipld.dag-json
DAG-CBOR (0x71): if the Accept header includes text/html, implementation should return a generated HTML with options to download DAG-CBOR as-is, or converted to DAG-JSON. Otherwise, handled the same as a raw block, but returned Content-Type is application/vnd.ipld.dag-cbor
The following response types require an explicit opt-in, can only be requested with format query parameter or Accept header:
Raw block (?format=raw)
CAR (?format=car)
TAR (?format=tar)
IPNS record (?format=ipns-record, codec 0x0300)
Content resolution is a process of turning an HTTP request into an IPFS content path, and then traversing it until the content identifier (CID) is found.
Path Gateway decides what content to serve by taking the path from the URL requested and splitting it into two parts: the CID and the remainder of the path.
The CID provides the starting point, often called content root. The remainder of the path, if present, will be used as instructions to traverse IPLD data, starting from that data which the CID identified.
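The split into content root and path remainder can be sketched as follows. The function name is hypothetical, the example URL and CID are placeholders, and the sketch does not validate that the root segment is a well-formed CID or a resolvable name.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def split_content_path(url: str):
    """Split a request URL into (namespace, root, remainder segments, params)."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # segments[0] is empty because the path starts with "/".
    if len(segments) < 3 or segments[1] not in ("ipfs", "ipns") or not segments[2]:
        raise ValueError("not a content path: " + parts.path)
    namespace, root = segments[1], segments[2]
    remainder = [unquote(s) for s in segments[3:] if s]
    return namespace, root, remainder, parse_qs(parts.query)
```

The remainder list is what a gateway would then walk, one segment at a time, starting from the data identified by the root.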
Note: Other types of gateway may allow for passing CID by other means, such as Host header, changing the rules behind path splitting. (See SUBDOMAIN_GATEWAY.md and DNSLINK_GATEWAY.md).
After the content root CID is found, the remainder of the path should be traversed and resolved. Depending on the data type, that may occur through UnixFS pathing, or DAG-JSON and DAG-CBOR pathing.
UnixFS is an abstraction over the low level logical DAG-PB pathing from IPLD, providing a better user experience:
/ipfs/cid/dir-name/file-name.txt
For more details regarding DAG-PB pathing, please read the "Path Resolution" section of this document.
Traversing through DAG-JSON and DAG-CBOR is possible through fields that encode a link:
In DAG-JSON, a link is represented as a CID under the / reserved namespace, see specification.
Note: pathing into IPLD Kind other than Link (CID) is not supported at the moment. Implementations should return HTTP 501 Not Implemented when fully resolved content path has any remainder left. This feature may be specified in a future IPIP that introduces data onboarding and IPLD Patch semantics.
Gateway MUST respond with HTTP error when it is not possible to traverse the requested content path:
404 Not Found should be returned when the root CID is valid and traversable, but the DAG it represents does not include content path remainder. The error response should indicate which part of the content path (/ipfs/{cid}/path/to/file) is missing.
400 Bad Request should be returned when the root CID under the ipfs namespace is invalid.
500 Internal Server Error can be used for remaining traversal errors, such as domains that cannot be resolved, or IPNS keys that cannot be resolved.
Following HTTP Caching rules around Etag, Cache-Control, If-None-Match and Last-Modified should produce acceptable cache hits.
Advanced caching strategies can be built using additional information in X-Ipfs-Path and X-Ipfs-Roots headers.
Implement support for requests sent with Cache-Control: only-if-cached.
It allows IPFS-aware HTTP clients to probe and prioritize gateways that already have the data cached, significantly improving retrieval speeds.
Optional, but encouraged.
Implementations are encouraged to support pluggable denylists to allow IPFS node operators to opt into not hosting previously flagged content.
Gateway MUST respond with HTTP error when requested CID is on any of active denylists:
Gateway implementation MAY apply some denylists by default as long as the gateway operator is able to inspect and modify the list of denylists that are applied.
Examples of public deny lists
Each entry is sha256() hashed so that it can easily be checked given a plaintext CID, but inconvenient to determine otherwise.
While implementations decide on the way HTML directory listing is generated and presented to the user, following the suggestions below is advised.
Linking to alternative response types such as CAR and dag-json allows clients to consume directory listings programmatically without the need for parsing HTML.
Directory index response time should not grow with the number of items in a directory. It should always be fast, even when a directory has 10k items.
The usual optimizations involve:
Skipping size and type resolution for child UnixFS items, and using Tsize from logical format instead, allows gateway to respond much faster, as it no longer needs to fetch root nodes of child items.
bafybeiggvykl7skb2ndlmacg2k5modvudocffxjesexlod2pfvg5yhwrqm
An alternative approach is resolving child items, but providing pagination UI.
Pagination can be exposed through query parameters (for example ?page=0&limit=100), limiting the cost of a single page load.
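The pagination approach can be sketched with a simple slice. The function name is hypothetical; the page and limit parameters mirror the ?page=0&limit=100 example, and a zero-based page index is an assumption.

```python
def paginate(entries, page: int = 0, limit: int = 100):
    """Return the slice of directory entries for a zero-based page."""
    if page < 0 or limit <= 0:
        raise ValueError("page must be >= 0 and limit must be > 0")
    start = page * limit
    return entries[start:start + limit]


# Hypothetical directory with 250 child names.
entries = ["file-%d" % i for i in range(250)]
first_page = paginate(entries, page=0, limit=100)
```

Only the children on the requested page need to be resolved, so the cost of one page load stays bounded regardless of directory size.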