Time To First Byte
Spark measures Time to First Byte (TTFB) by aggregating median retrieval times from globally distributed checker nodes, ensuring accurate latency assessment for storage providers.
Introduction
Time to First Byte (TTFB) is a measurement used to indicate the responsiveness of a web server or other network resource. In the context of Spark, TTFB represents the median time it takes for checker nodes to receive the first byte when retrieving content from a storage provider (SP). These TTFB values are aggregated by storage provider on a daily basis.
Checker nodes are globally distributed, so TTFB can vary significantly among them depending on their geographic location and the network latency to the storage provider. Latency here means the time a request takes to travel across the internet from a checker node to the storage provider.
How do checker nodes measure TTFB?
Storage providers offer two protocols for checker nodes to retrieve content: IPFS Trustless Gateway and GraphSync. The protocol used by the storage provider affects how the checker node measures the time to first byte (TTFB).
- When a checker node retrieves content via IPFS Trustless Gateway, TTFB is the time between the request for a file and the receipt of the first byte from the storage provider.
- When a checker node retrieves content via GraphSync, TTFB is measured at the point when the third-party library handling the retrieval returns the first byte. This happens only after the entire file has been downloaded, so the reported value includes the full download time.
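The difference between the two cases can be illustrated with a minimal Python sketch. It is not Spark's checker code: `measure_ttfb_ms`, `streaming_retrieval`, and `buffered_retrieval` are hypothetical names, and the retrievals are simulated with `time.sleep` rather than real network requests. The streaming stand-in hands over bytes as they arrive, while the buffered stand-in only returns data once everything has been downloaded.

```python
import time
from typing import Callable, Iterable, Iterator


def measure_ttfb_ms(start_retrieval: Callable[[], Iterable[bytes]]) -> float:
    """Milliseconds from starting the retrieval until the first non-empty chunk."""
    started = time.perf_counter()
    for chunk in start_retrieval():
        if chunk:
            return (time.perf_counter() - started) * 1000.0
    raise ValueError("retrieval ended before any byte was received")


def streaming_retrieval() -> Iterator[bytes]:
    """Simulated streaming response: chunks are yielded as they arrive."""
    time.sleep(0.02)  # simulated wait for the first byte
    yield b"first"
    time.sleep(0.20)  # simulated remainder of the download
    yield b"rest"


def buffered_retrieval() -> Iterable[bytes]:
    """Simulated library that returns data only after the full download."""
    return list(streaming_retrieval())


streaming_ms = measure_ttfb_ms(streaming_retrieval)
buffered_ms = measure_ttfb_ms(buffered_retrieval)
print(f"streaming: {streaming_ms:.0f} ms, buffered: {buffered_ms:.0f} ms")
```

With these simulated delays, the buffered measurement is roughly the full download time, while the streaming one reflects only the wait for the first chunk.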
Calculating Storage Provider TTFB
For each Spark task, a set of checker nodes records how long it took to receive the first byte from the storage provider and reports this data back to Spark. We define a committee as the set of checker nodes that submitted a measurement for a given task.
The measurements from each committee are collected and evaluated. After evaluation, valid measurements are selected to calculate the median time to first byte for a retrieval. We use the median to reduce the impact of outliers, both from measurements reported by bad actors and from naturally occurring network latencies.
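The median TTFB for a committee is calculated as:

$$\mathrm{TTFB}_{\text{committee}}(c) = \mathrm{median}\{\, t_m : m \in V_c \,\}$$

where $V_c$ is the set of valid measurements in committee $c$ and $t_m$ is the TTFB reported by measurement $m$.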
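As a rough illustration of the committee step, the following Python sketch filters out invalid measurements and takes the median of the rest. The `Measurement` record and the `committee_ttfb` function are hypothetical and do not reflect Spark's actual schema. The sketch also assumes that an even number of values is resolved by averaging the two middle ones, which is what `statistics.median` does; Spark may handle that case differently.

```python
from statistics import median
from typing import Iterable, NamedTuple


class Measurement(NamedTuple):
    # Hypothetical record shape, not Spark's actual schema.
    checker_id: str
    ttfb_ms: float
    valid: bool  # outcome of the evaluation step


def committee_ttfb(measurements: Iterable[Measurement]) -> float:
    """Median TTFB over the valid measurements of one committee."""
    valid_values = [m.ttfb_ms for m in measurements if m.valid]
    if not valid_values:
        raise ValueError("committee has no valid measurements")
    return median(valid_values)


committee = [
    Measurement("checker-a", 120.0, True),
    Measurement("checker-b", 150.0, True),
    Measurement("checker-c", 9000.0, True),  # slow outlier
    Measurement("checker-d", 1.0, False),    # rejected during evaluation
    Measurement("checker-e", 135.0, True),
]
print(committee_ttfb(committee))  # 142.5
```

The single 9000 ms outlier barely moves the result, whereas the mean of the same four valid values would be 2351.25 ms.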
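All committee TTFB values are then grouped by date and storage provider. The final storage provider TTFB is the median of the values in each group:

$$\mathrm{TTFB}_{\text{SP}}(p, d) = \mathrm{median}\{\, \mathrm{TTFB}_{\text{committee}}(c) : c \in C_{p,d} \,\}$$

where $C_{p,d}$ is the set of committees that measured storage provider $p$ on date $d$.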
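The grouping step could be sketched in Python as follows. The row layout, the provider identifiers (`sp-1`, `sp-2`), the dates, and the function name `provider_ttfb` are all made up for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

# Hypothetical rows: (date, storage_provider_id, committee_ttfb_ms)
Row = Tuple[str, str, float]


def provider_ttfb(rows: Iterable[Row]) -> Dict[Tuple[str, str], float]:
    """Median of committee TTFB values per (date, storage provider)."""
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for date, provider, ttfb_ms in rows:
        grouped[(date, provider)].append(ttfb_ms)
    return {key: median(values) for key, values in grouped.items()}


committee_values = [
    ("2024-06-01", "sp-1", 140.0),
    ("2024-06-01", "sp-1", 180.0),
    ("2024-06-01", "sp-1", 950.0),
    ("2024-06-01", "sp-2", 300.0),
    ("2024-06-02", "sp-1", 160.0),
]
daily = provider_ttfb(committee_values)
print(daily[("2024-06-01", "sp-1")])  # 180.0
```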
Calculating Spark TTFB
Similar to the storage provider TTFB, the Spark TTFB for a given date is the median of all storage provider TTFB values calculated for that date.
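A minimal sketch of this last step, assuming the storage provider values for one date are already available as a mapping from a hypothetical provider identifier to its TTFB in milliseconds:

```python
from statistics import median
from typing import Mapping


def spark_ttfb(provider_ttfb_ms: Mapping[str, float]) -> float:
    """Median of the storage provider TTFB values for a single date."""
    if not provider_ttfb_ms:
        raise ValueError("no storage provider TTFB values for this date")
    return median(provider_ttfb_ms.values())


# Made-up values for one date.
providers_on_date = {"sp-1": 180.0, "sp-2": 300.0, "sp-3": 220.0}
print(spark_ttfb(providers_on_date))  # 220.0
```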