I suppose you could turn off the sound of the film and just read the subtitles.  But if you are going to do that, you might do much better, in terms of accessibility, to see whether the script of the movie is available.  Some sites, such as Scriptorama (spelling?), are completely useless because the person goes to all the trouble of transcribing script after script but doesn't include any information such as who is speaking or the setting.  Other sites have full scripts, but on some of them you will get earlier drafts, not the final scripts, and there may be important differences.  So be careful about what the site tells you about the script.
Actually, as I think about it, it might be cumbersome, but maybe you could use a script from Scriptorama to find the passages that are unintelligible in the film, pause the film to read them, and then resume it.

        I get what you're saying, but just imagine what it would be like to have the dialog, the background noise that's part of the scene, and synthesized subtitles all being churned out at the same time.

        I understand what you're trying to solve, but I don't think that adding "a third layer" that's also presented auditorily will actually do that.  I guess it can't hurt to try, if it's possible, but I suspect a "making it worse, not better" outcome.

