Loading static files from disk
Authorizing and Mapping Urls and Domains
By default PageSpeed loads sub-resources via an HTTP fetch. Loading them directly from the filesystem would be faster, but it is not always safe: a sub-resource may be dynamically generated, or it may not be stored on the same server.
However, you can explicitly tell PageSpeed to load static sub-resources from disk by using the LoadFromFile
directive. For example:
pagespeed LoadFromFile "http://www.example.com/static/" "c:\www\static/"
tells PageSpeed to load all resources whose URLs start with http://www.example.com/static/
from the filesystem under c:\www\static/
. For example, http://www.example.com/static/images/foo.png
will be loaded from the file c:\www\static/images/foo.png
. However, http://www.example.com/bar.jpg
will still be fetched using HTTP.
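The prefix mapping can be pictured with a short Python sketch. It only illustrates the behavior described above and is not PageSpeed's implementation; the names `MAPPINGS` and `map_url_to_file` are hypothetical.

```python
# Hypothetical model of LoadFromFile prefix mapping (illustration only).
# Each entry pairs a URL prefix with an absolute filesystem prefix.
MAPPINGS = [
    ("http://www.example.com/static/", "c:\\www\\static/"),
]


def map_url_to_file(url, mappings=MAPPINGS):
    """Return the mapped filename, or None if the URL is fetched over HTTP."""
    for url_prefix, file_prefix in mappings:
        if url.startswith(url_prefix):
            # Swap the URL prefix for the filesystem prefix, keep the rest.
            return file_prefix + url[len(url_prefix):]
    return None


print(map_url_to_file("http://www.example.com/static/images/foo.png"))
# prints c:\www\static/images/foo.png
print(map_url_to_file("http://www.example.com/bar.jpg"))
# prints None
```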
If you need more sophisticated prefix-matching behavior, you can use the LoadFromFileMatch
directive, which supports RE2-formatted regular expressions. (Note that this is not the same format as the wildcards used above and elsewhere in PageSpeed.) For example:
pagespeed LoadFromFileMatch "^https?://example.com/~([^/]*)/static/" "c:\www\static/\\1"
will load http://example.com/~pat/static/cat.jpg
from c:\www\static/pat/cat.jpg
, http://example.com/~sam/static/images/dog.jpg
from c:\www\static/sam/images/dog.jpg
, and https://example.com/~al/static/css/ie
from c:\www\static/al/css/ie
. The resource http://example.com/~pat/images/static/puppy.gif
, however, would not be matched by this directive and would be fetched using HTTP.
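To see which URLs the pattern accepts, the example can be reproduced in Python. This is a rough sketch: Python's `re` module stands in for RE2 (this particular pattern is valid in both), the function name `map_url_with_regex` is hypothetical, and the `/` inserted between the substituted prefix and the rest of the URL is an assumption chosen so that the results agree with the paths listed above.

```python
import re

# The URL pattern from the directive above.
URL_PATTERN = re.compile(r"^https?://example.com/~([^/]*)/static/")
FILE_PREFIX = "c:\\www\\static/"


def map_url_with_regex(url):
    """Return the mapped filename, or None when the URL does not match."""
    match = URL_PATTERN.match(url)
    if match is None:
        return None  # not matched: the resource is fetched over HTTP
    user = match.group(1)      # the text that \\1 refers to in the directive
    rest = url[match.end():]   # everything after the matched prefix
    # Assumption: a "/" joins the substituted prefix and the remainder.
    return FILE_PREFIX + user + "/" + rest


for url in (
    "http://example.com/~pat/static/cat.jpg",
    "http://example.com/~sam/static/images/dog.jpg",
    "https://example.com/~al/static/css/ie",
    "http://example.com/~pat/images/static/puppy.gif",
):
    print(url, "->", map_url_with_regex(url))
```

The last URL maps to `None` because `([^/]*)` cannot span the `/images` path segment, so `/static/` never follows the captured name directly.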
Because PageSpeed loads these files directly from the filesystem, custom response headers that the web server would add when serving them over HTTP will not be set.
You can also use the LoadFromFile
directive to load HTTPS resources that would not otherwise be fetchable directly. For example:
pagespeed LoadFromFile "https://www.example.com/static/" "c:\www\static/";
The filesystem path must be an absolute path.
You can specify multiple LoadFromFile
associations in configuration files. Note that large numbers of such directives may impact performance.
If the sub-resource cannot be loaded from file in the directory specified, the sub-request will fail (rather than fall back to HTTP fetch). Part of the reason for this is to indicate a configuration error more clearly.
As an added benefit, when resources are loaded from file, the rewritten versions are updated immediately when you change the associated file. Resources loaded via normal HTTP fetches are refreshed only when they expire from the cache (by default every 5 minutes), so their rewritten versions are updated only as often as the cache is refreshed. Resources loaded from file are not subject to this caching behavior because they are read directly from the filesystem on every request for the rewritten version.
See also MapOriginDomain
.
This directive cannot be used in location-specific configuration sections.
Limiting Direct Loading
A mapping set up with LoadFromFile
allows filesystem loading for anything it matches. If you have directories or file types that cannot be loaded directly from the filesystem, LoadFromFileRule
lets you add fine-grained rules to control which files will be loaded directly and which will fall back to the standard process, over HTTP.
When given a URL, PageSpeed first determines whether any LoadFromFile mappings apply. If one does, it calculates the mapped filename and checks for applicable LoadFromFileRules. It considers the rules in reverse order of definition, takes the first applicable one, and uses it to decide whether to load from file or fall back to HTTP.
Some examples may be helpful. Consider a website that is entirely static content except for a /bin
directory:
c:\www\index.html
c:\www\css\style.css
c:\www\gfx\image.png
c:\www\bin\webapp.dll
While most of the site can be loaded directly from the filesystem, webapp.dll
and web.config
are files that need to be interpreted before serving -- or not served at all! Adding a rule disallowing the /bin
directory tells PageSpeed to fall back to HTTP for those files:
pagespeed LoadFromFile http://example.com/ c:\www\
pagespeed LoadFromFileRule Disallow c:\www\bin
The LoadFromFileRule
directive takes two arguments. The first must be either Allow
or Disallow
while the second is a prefix that specifies which filesystem paths it should apply to. Because the default is to allow loading from the filesystem for all paths listed in any LoadFromFile
statement, most of the time you will be using Disallow
to turn off filesystem loading for some subset of those paths. You would use Allow
only after a Disallow
that was overly general.
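The evaluation order for prefix rules can be sketched in Python as follows. This is a simplified model rather than PageSpeed's code: the names `RULES` and `may_load_from_file` are hypothetical, and a plain string-prefix test is assumed.

```python
# Hypothetical model of LoadFromFileRule evaluation (illustration only).
# Rules are kept in the order in which they appear in the configuration.
RULES = [
    ("Disallow", "c:\\www\\bin"),
]


def may_load_from_file(filename, rules=RULES):
    """Return True if the mapped filename may be read directly from disk."""
    # Rules are considered in reverse order of definition; the first
    # applicable one decides.
    for action, prefix in reversed(rules):
        if filename.startswith(prefix):
            return action == "Allow"
    # No rule applies: loading from file is allowed by default.
    return True


print(may_load_from_file("c:\\www\\css\\style.css"))   # True
print(may_load_from_file("c:\\www\\bin\\webapp.dll"))  # False
```

Appending a later `("Allow", "c:\\www\\bin\\public")` entry (a made-up subdirectory) would re-enable direct loading for that subtree only, which is the "Allow after an overly general Disallow" case described above.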
Not all sites are well suited for prefix-based control. Consider a site with .aspx files mixed in with ordinary static files:
c:\www\index.html
c:\www\webmail.aspx
c:\www\webmail.css
c:\www\blog/index.aspx
c:\www\blog/header.png
c:\www\blog/blog.css
Blacklisting just the .aspx
files so they fall back to an HTTP fetch allows everything else to be loaded directly from the filesystem:
pagespeed LoadFromFile http://example.com/ c:\www\;
pagespeed LoadFromFileRuleMatch Disallow \.aspx$;
The LoadFromFileRuleMatch
directive also takes two arguments. The first is either Allow
or Disallow
and functions just like for LoadFromFileRule
above. The second argument, however, is a RE2-format regular expression instead of a file prefix. Remember to escape characters that have special meaning in regular expressions. For example, if instead of \.aspx$
we had simply .aspx$
then a file named example.notaspx
would still be forced to load over HTTP because ".
" is special syntax for "match any single character".
Consider a site with the opposite problem: a few file types can be reliably loaded from file but the rest need interpretation first. For example:
c:\www\index.html
c:\www\site.css
c:\www\script-using-ssi.js
c:\www\generate-image.ashx
c:\www\
In this site, generate-image.ashx
needs to be interpreted to generate images. The only resources on the site that are safe to load directly from the filesystem are the .css
files. By first blacklisting everything and then whitelisting only the .css
files, we can make PageSpeed do this:
pagespeed LoadFromFile http://example.com/ c:\www\
pagespeed LoadFromFileRuleMatch disallow .*
pagespeed LoadFromFileRuleMatch allow \.css$
This works because order is significant: later rules take precedence over earlier ones.
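A small Python sketch shows why the ordering matters. It is an illustration under stated assumptions, not PageSpeed's implementation: Python's `re` stands in for RE2, the pattern is assumed to match anywhere in the filename (an unanchored search), and the names `RULES` and `may_load_from_file` are hypothetical.

```python
import re

# Rules in configuration order, mirroring the two directives above.
RULES = [
    ("disallow", re.compile(r".*")),
    ("allow", re.compile(r"\.css$")),
]


def may_load_from_file(filename, rules=RULES):
    """Return True if the mapped filename may be read directly from disk."""
    # Later rules take precedence, so scan from the last rule to the first.
    for action, pattern in reversed(rules):
        if pattern.search(filename):
            return action.lower() == "allow"
    # No rule applies: loading from file is allowed by default.
    return True


for name in ("c:\\www\\site.css", "c:\\www\\index.html",
             "c:\\www\\generate-image.ashx"):
    print(name, may_load_from_file(name))
```

Only `site.css` is reported as loadable from file. If the two rules were swapped, the catch-all `.*` rule would be checked first and every file would fall back to HTTP.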