Skip to content

feat(DASH): Add period caching to speed up manifest parsing#9353

Open
grabofus wants to merge 9 commits intoshaka-project:mainfrom
DiceTechnology:feat/period-cache
Open

feat(DASH): Add period caching to speed up manifest parsing#9353
grabofus wants to merge 9 commits intoshaka-project:mainfrom
DiceTechnology:feat/period-cache

Conversation

@grabofus
Copy link
Contributor

@grabofus grabofus commented Nov 4, 2025

Adds period caching to speed up manifest parsing. The aim of this feature is to improve parsing of length multi-period DASH manifests on low power devices.

@shaka-bot
Copy link
Collaborator

Incremental code coverage: 52.21%

@avelad
Copy link
Member

avelad commented Nov 5, 2025

Why create a configuration and not always enable it?

@avelad avelad requested a review from tykus160 November 5, 2025 10:52
@avelad avelad added type: enhancement New feature or request priority: P3 Useful but not urgent component: DASH The issue involves the MPEG DASH manifest format labels Nov 5, 2025
@avelad avelad added this to the v5.0 milestone Nov 5, 2025
@tykus160
Copy link
Member

tykus160 commented Nov 5, 2025

@grabofus thanks for your PR, finding out serious bottleneck and trying to find a way to overcome it! Your change looks OK in general, however let me share another idea I have about it.

Personally I don't think we should maintain such a cache. Preferably, I think parser should be a bit smarter and work on some assumptions. The overall expectation in live updates should be that either some data has been removed from the beginning of a file, or added at the end of it. These assumptions are already used silently in other parts of code in shaka, i.e. SegmentIndex, PeriodCombiner or MPD Patch implementation. In my opinion live updates logic should leverage based on those assumptions.

Due to my thoughts above, I'm also not sure would we need to calculate hash in a way introduced here, as it's basically DASH agnostic. In doing such live updates we should instead check period characteristics, in particular did any timeline has changed in starting or ending periods.

Another idea would be to lazy parse periods, but that's definitely not in scope of this work.

That's my general thought that I wanted to share with community. If we decide this approach is a better way to go, I will of course review it through-and-through.

cc @shaka-project/shaka-player

@Murmur
Copy link

Murmur commented Nov 6, 2025

The overall expectation in live updates should be that either some data has been removed from the beginning of a file, or added at the end of it. | we should instead check period characteristics, in particular did any timeline has changed in starting or ending periods.

Yes and I think DASH specs has a text saying "middle periods" should/shall not be modified after introduced in a live streaming.

  • First period (oldest): expiration of old segments in a timeline ok (make a period shorter)
  • last period (latest): append new segments in a timeline is ok (make a period longer)

There might be live streams available that may change "middle periods" adhoc and a specific app players may cope it well.
Maube OP has test streams for a reason of "personalized advert/dash event" inserter logic?

const periods = [];
let prevEnd = seekRangeStart;
const periodNodes = TXml.findChildren(mpd, 'Period');
const enablePeriodCaching = context.dynamic && !!(this.config_ && this.config_.dash &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit: I don't see why !! is needed here.

this.periodCache_.get(cacheKey) : null;
let reusePeriod = false;
if (cachedEntry) {
signature = this.computePeriodSignature_(elem);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me why we need both a period cache key and a separate signature.

Let me see if I understand...

We need the key to be cheap, but the signature is more expensive? And we might find a key-match with a different signature, and then we have to treat it as a cache miss?

Are there any cases where we don't have to compute a signature for each and every period? If not, couldn't we move lift the signature computation instead of repeating it in two places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache key won't change for the period, but it's signature can if its content is modified. So the idea was to be able to quickly find a cached entry then validate if it had any changes by comparing signatures.

if (context.period.id && periodDuration) {
this.periodDurations_.set(context.period.id, periodDuration);
if (enablePeriodCaching && period.id && signature) {
this.periodCache_.set(period.id, {signature, period});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you set into periodCache_ by period.id. Above, you get into periodCache_ by the return of getPeriodCacheKey_(). Why wouldn't you just use getPeriodCacheKey_() in all cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed this one 👍

* @return {number}
* @private
*/
fnv1a_(str, seed) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see these (fnv1a_ and hashTXml_) moved to a utility class. It's distracting here in the DASH parser, and also not testable as a private method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved them into a HashUtils class and added some unit tests 👍

* @private
*/
hashTXml_(node, seed = 0x811c9dc5) {
let hash = this.fnv1a_(node.tagName, seed >>> 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of >>> 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That part was copied from somewhere, I don't know an awful lot about hashing functions so I left that in, but I don't think it makes much difference if we remove it.

@avelad
Copy link
Member

avelad commented Dec 15, 2025

@grabofus can you review the previous comments? Thanks!

@grabofus grabofus marked this pull request as ready for review December 18, 2025 16:56
@grabofus
Copy link
Contributor Author

@avelad addressed the comments - sorry for the delay haven't had much time lately to concentrate on this

Why create a configuration and not always enable it?

Unsure if it benefits everyone, or if it could have side effects in some streaming configurations (e.g. if longer SegmentTimelines are used) so I didn't enable it by default. Happy to change it if you think it should be enabled.

Personally I don't think we should maintain such a cache. Preferably, I think parser should be a bit smarter and work on some assumptions. The overall expectation in live updates should be that either some data has been removed from the beginning of a file, or added at the end of it. These assumptions are already used silently in other parts of code in shaka, i.e. SegmentIndex, PeriodCombiner or MPD Patch implementation. In my opinion live updates logic should leverage based on those assumptions.

Due to my thoughts above, I'm also not sure would we need to calculate hash in a way introduced here, as it's basically DASH agnostic. In doing such live updates we should instead check period characteristics, in particular did any timeline has changed in starting or ending periods.

I was not sure initially if we can rely on middle periods not changing, but this makes perfect sense and is a more performant way to solve this issue.

@grabofus grabofus requested a review from joeyparrish December 18, 2025 17:01
@joeyparrish
Copy link
Member

Personally I don't think we should maintain such a cache. Preferably, I think parser should be a bit smarter and work on some assumptions. The overall expectation in live updates should be that either some data has been removed from the beginning of a file, or added at the end of it. These assumptions are already used silently in other parts of code

I also don't love the hashing & caching, and would prefer something that uses these assumptions.

I was not sure initially if we can rely on middle periods not changing, but this makes perfect sense and is a more performant way to solve this issue.

Are we going in this direction instead? If so, who will do it? @avelad, @grabofus

Thanks!

@avelad avelad modified the milestones: v5.0, v5.1 Feb 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component: DASH The issue involves the MPEG DASH manifest format priority: P3 Useful but not urgent type: enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants