[TOC]

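Programs exhibit locality of reference:
    Temporal locality: data that has just been accessed is likely to be accessed again soon
    Spatial locality: when data is accessed, the data stored near it is also likely to be accessed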
    
cache: hit
    
    Hot spots: locality;
        Validity over time:
            Cache space exhausted: LRU eviction
            Expiration: cache cleanup
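The LRU policy mentioned above can be sketched in a few lines of Python. This is only an illustration of the eviction idea, not how varnish implements it; the class name `LRUCache` and the sample keys are made up for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("/a", "A")
cache.put("/b", "B")
cache.get("/a")       # "/a" becomes the most recently used entry
cache.put("/c", "C")  # the cache is full, so "/b" is evicted
```

After the last call only "/a" and "/c" remain, because "/b" was the entry that had gone unused the longest.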
            
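Cache hit ratio: hit/(hit+miss)
    Range: [0,1]
    Page hit ratio: measured by the number of pages (objects)
    Byte hit ratio: measured by the size of the pages in bytes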
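A small Python sketch of how the two ratios can diverge. The request list is invented for illustration: three small objects served from cache and one large object that missed.

```python
def hit_ratio(hits, misses):
    """Return hit/(hit+miss), or 0.0 when nothing was looked up."""
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical access log: (served from cache?, response size in bytes)
requests = [(True, 1000), (True, 1000), (True, 1000), (False, 7000)]

page_hits = sum(1 for hit, _ in requests if hit)
page_misses = len(requests) - page_hits
byte_hits = sum(size for hit, size in requests if hit)
byte_misses = sum(size for hit, size in requests if not hit)

page_ratio = hit_ratio(page_hits, page_misses)  # 3 of 4 pages
byte_ratio = hit_ratio(byte_hits, byte_misses)  # 3000 of 10000 bytes
```

Here the page hit ratio is 0.75 while the byte hit ratio is only 0.3, which is why the two are tracked separately.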
    
Whether to cache:
    Private data: private, private cache only;
    Public data: public, public or private cache;

Cache-related Header Fields
    The most important caching header fields are:

        Expires: expiration time;
            Expires:Thu, 22 Oct 2026 06:34:30 GMT
        Cache-Control
        
        Etag
        If-None-Match
        
        Last-Modified
        If-Modified-Since
        
        Vary
        Age

    Cache validation mechanisms:
        Expiration time:
            HTTP/1.0
                Expires
            HTTP/1.1
                Cache-Control: max-age=
                Cache-Control: s-maxage=
        Conditional requests:
            Last-Modified/If-Modified-Since
            Etag/If-None-Match
            
        Expires:Thu, 13 Aug 2026 02:05:12 GMT
        Cache-Control:max-age=315360000
        ETag:"1ec5-502264e2ae4c0"
        Last-Modified:Wed, 03 Sep 2014 10:00:27 GMT
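The two mechanisms can be sketched in Python using the sample headers above. This is a simplified illustration: it assumes max-age takes precedence over Expires, compares a single ETag by exact string match, and the function names and the response time `now` are made up for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Seconds the response may be served without revalidation."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)  # HTTP/1.1 max-age wins over Expires
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        return max(0, int((expires - response_time).total_seconds()))
    return 0


def is_not_modified(request_headers, cached_headers):
    """True when a conditional request could be answered with 304."""
    if "If-None-Match" in request_headers:
        return request_headers["If-None-Match"] == cached_headers.get("ETag")
    if "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        modified = parsedate_to_datetime(cached_headers["Last-Modified"])
        return modified <= since
    return False


cached = {
    "Expires": "Thu, 13 Aug 2026 02:05:12 GMT",
    "Cache-Control": "max-age=315360000",
    "ETag": '"1ec5-502264e2ae4c0"',
    "Last-Modified": "Wed, 03 Sep 2014 10:00:27 GMT",
}
now = datetime(2016, 8, 13, 2, 5, 12, tzinfo=timezone.utc)  # assumed response time
lifetime = freshness_lifetime(cached, now)
```

With both headers present, `lifetime` comes from max-age; Expires is only consulted when max-age is absent.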
        
    Cache hierarchy:
        Private cache: the local cache built into the user agent;
        Public cache: the cache provided by a reverse proxy server;
        
        User-Agent <--> private cache <--> public cache <--> public cache 2 <--> Origin Server

Request directives tell the cache how it may use cached content to answer the request:
    cache-request-directive = 
        "no-cache"
        | "no-store"                         
        | "max-age" "=" delta-seconds        
        | "max-stale" [ "=" delta-seconds ]  
        | "min-fresh" "=" delta-seconds      
        | "no-transform"                    
        | "only-if-cached"                  
        | cache-extension                    

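Response directives tell the cache how to store the content returned by the upstream server:
    cache-response-directive =
        "public"                               
        | "private" [ "=" <"> 1#field-name <"> ] 
        | "no-cache" [ "=" <"> 1#field-name <"> ]   ; may be stored, but must be revalidated before being served to a client
        | "no-store"                               ; the response must not be stored in the cache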
        | "no-transform"                        
        | "must-revalidate"                     
        | "proxy-revalidate"                  
        | "max-age" "=" delta-seconds           
        | "s-maxage" "=" delta-seconds          
        | cache-extension     
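As a rough illustration of how a shared cache might read these response directives, the Python sketch below parses a Cache-Control value and derives a storage decision. It is deliberately simplified: it splits on commas (so a quoted field-name list containing commas is not handled), treats any `private` as "do not store", and the function names are invented for the example.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into {directive: argument}."""
    directives = {}
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def storable_by_shared_cache(directives):
    """A shared (public) cache must not store no-store or private responses."""
    return "no-store" not in directives and "private" not in directives


def shared_ttl(directives):
    """s-maxage applies to shared caches and overrides max-age there."""
    for name in ("s-maxage", "max-age"):
        arg = directives.get(name)
        if arg is not None and arg.isdigit():
            return int(arg)
    return None  # no explicit lifetime given


policy = parse_cache_control("public, s-maxage=600, max-age=60")
```

For this header a shared cache would keep the object for 600 seconds, while a private cache would use the 60-second max-age.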
        
Open source solutions:
    squid:
    varnish:
        
    Official varnish site: http://www.varnish-cache.org/
        Community
        Enterprise
        
         This is Varnish Cache, a high-performance HTTP accelerator. 
        
    Program architecture:
        Manager process
        Cacher process, which contains several types of threads:
            accept, worker, expiry, ... 
        shared memory log:
            Statistics: counters;
            Log area: log records;
                varnishlog, varnishncsa, varnishstat... 
            
        Configuration interface: VCL
            Varnish Configuration Language, 
                vcl compiler --> c compiler --> shared object 

                
    varnish program environment:
        /etc/varnish/varnish.params: configures how the varnish service process works, e.g. the listening address and port, and the cache storage mechanism;
        /etc/varnish/default.vcl: configures the caching behavior of the Child/Cache threads;
        Main program:
            /usr/sbin/varnishd
        CLI interface:
            /usr/bin/varnishadm
        Tools for working with the Shared Memory Log:
            /usr/bin/varnishhist
            /usr/bin/varnishlog
            /usr/bin/varnishncsa
            /usr/bin/varnishstat
            /usr/bin/varnishtop     
        Test tool:
            /usr/bin/varnishtest
        VCL configuration reload program:
            /usr/sbin/varnish_reload_vcl
        Systemd Unit File:
            /usr/lib/systemd/system/varnish.service
                the varnish service
            /usr/lib/systemd/system/varnishlog.service
            /usr/lib/systemd/system/varnishncsa.service 
                services that persist the logs to disk;
                
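    varnish cache storage mechanisms (Storage Types):
        -s [name=]type[,options]
        
        ・ malloc[,size]
            In-memory storage; [,size] defines the amount of space; all cached objects are lost on restart;
        ・ file[,path[,size[,granularity]]]
            File-backed storage, opaque (a black box); all cached objects are lost on restart;
        ・ persistent,path,size
            File-backed storage, opaque (a black box); cached objects survive a restart; experimental;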
            
    varnish program options:
        Program options: /etc/varnish/varnish.params file
            -a address[:port][,address[:port][...]]: port 6081 by default; 
            -T address[:port]: port 6082 by default;
            -s [name=]type[,options]: defines the cache storage mechanism;
            -u user
            -g group
            -f config: the VCL configuration file;
            -F: run in the foreground;
            ...
        Runtime parameters: /etc/varnish/varnish.params file, DAEMON_OPTS
            DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"
            
            -p param=value: sets a runtime parameter and its value; may be repeated;
            -r param[,param...]: makes the listed parameters read-only; 
            
    Reloading the vcl configuration file:
        ~ ]# varnish_reload_vcl
            
    varnishadm
        -S /etc/varnish/secret -T [ADDRESS:]PORT 
  
        help [<command>]
        ping [<timestamp>]
        auth <response>
        quit
        banner
        status
        start
        stop
        vcl.load <configname> <filename>
        vcl.inline <configname> <quoted_VCLstring>
        vcl.use <configname>
        vcl.discard <configname>
        vcl.list
        param.show [-l] [<param>]
        param.set <param> <value>
        panic.show
        panic.clear
        storage.list
        vcl.show [-v] <configname>
        backend.list [<backend_expression>]
        backend.set_health <backend_expression> <state>
        ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
        ban.list    
        
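        Configuration related:
            vcl.list 
            vcl.load: load and compile;
            vcl.use: activate;
            vcl.discard: delete;
            vcl.show [-v] <configname>: show the details of the specified configuration;
            
        Runtime parameters:
            param.show -l: show the list;
            param.show <PARAM>
            param.set <PARAM> <VALUE>
            
        Cache storage:
            storage.list
            
        Backend servers:
            backend.list 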
            
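    VCL:
        A domain-specific configuration language;
        
        state engine;
        
        VCL has multiple state engines. The states are related, but the engines are isolated from one another; each state engine uses return(x) to name the next engine to hand over to; each state engine corresponds to one section of the vcl file, i.e. a subroutine
        
            vcl_hash --> return(lookup) --> hit --> vcl_hit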
            
        Client Side:
            vcl_recv, vcl_pass, vcl_hit, vcl_miss, vcl_pipe, vcl_purge, vcl_synth, vcl_deliver
            
            vcl_recv:
                hash:vcl_hash
                pass: vcl_pass 
                pipe: vcl_pipe
                synth: vcl_synth
                purge: vcl_hash --> vcl_purge
                
            vcl_hash:
                lookup:
                    hit: vcl_hit
                    miss: vcl_miss
                    pass, hit_for_pass: vcl_pass
                    purge: vcl_purge
            
        Backend Side:
            vcl_backend_fetch, vcl_backend_response, vcl_backend_error
    
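        Two special engines:
            vcl_init: VCL code executed before any request is processed; mainly used to initialize VMODs;
            vcl_fini: called when the VCL configuration is discarded, after all requests have finished; mainly used to clean up VMODs;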
        
    vcl的语法格式:
        (1) VCL files start with vcl 4.0;
        (2) //, # and /* foo */ for comments;
        (3) Subroutines are declared with the sub keyword, e.g. sub vcl_recv { ... };
        (4) No loops, state-limited variables (built-in variables are restricted to particular engines);
        (5) Terminating statements with a keyword for next action as argument of the return() function, i.e.: return(action); this is what moves processing from one state engine to the next; 
        (6) Domain-specific;
        
    The VCL Finite State Machine
        (1) Each request is processed separately;
        (2) Each request is independent from others at any given time;
        (3) States are related, but isolated;
        (4) return(action); exits one state and instructs Varnish to proceed to the next state;
        (5) Built-in VCL code is always present and appended below your own VCL;
        
    Three main syntax forms:
        sub subroutine {
            ...
        }
        
        if (CONDITION) {
            ...
        } else {    
            ...
        }
        
        return(), hash_data()
        
    VCL Built-in Functions and Keywords
        Functions:
            regsub(str, regex, sub)
            regsuball(str, regex, sub)
            ban(boolean expression)
            hash_data(input)
            synthetic(str)
            
        Keywords:
            call subroutine, return(action),new,set,unset 
            
        Operators:
            ==, !=, ~, !~, >, >=, <, <=
            Logical operators: &&, ||, !
            Assignment: =
            
        Example: obj.hits
            sub vcl_deliver {
                if (obj.hits>0) {
                    set resp.http.X-Cache = "HIT via " + server.ip;
                } else {
                    set resp.http.X-Cache = "MISS via " + server.ip;
                }
            }
                    
    
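    Variable types:
        Built-in variables:
            req.*: request; relates to the request message sent by the client;
                req.http.*
                    req.http.User-Agent, req.http.Referer, ...
            bereq.*: relates to the HTTP request varnish sends to the BE (backend) host;
                bereq.http.*
            beresp.*: relates to the response message the BE host returns to varnish;
                beresp.http.*
            resp.*: relates to the response varnish sends to the client;
            obj.*: attributes of an object stored in the cache; read-only;
            
            Commonly used variables:
                bereq.*, req.*:
                    bereq.http.HEADERS
                    bereq.method: the request method;
                    bereq.url: the requested url;
                    bereq.proto: the protocol version of the request;
                    bereq.backend: the backend host to use;
                    
                    req.http.Cookie: the value of the Cookie header in the client's request; 
                    req.http.User-Agent ~ "chrome"
                    
                    
                beresp.*, resp.*:
                    beresp.http.HEADERS
                    beresp.status: the response status code;
                    beresp.proto: the protocol version;
                    beresp.backend.name: the name of the BE host;
                    beresp.ttl: the remaining time for which the BE host's response may be cached;
                    
                obj.*
                    obj.hits: the number of times this object has been served from the cache;
                    obj.ttl: the object's ttl value
                    
                server.*
                    server.ip
                    server.hostname
                client.*
                    client.ip                   
            
        User-defined:
            set 
            unset 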
        
    Example 1: force requests for certain resources to bypass the cache:
        sub vcl_recv {
            if (req.url ~ "(?i)^/(login|admin)") {
                return(pass);
            }
        }
            
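    Example 2: for certain types of resources, such as public images, strip the marker that makes them private (Set-Cookie) and force the length of time varnish may cache them; 
        sub vcl_backend_response {
            if (beresp.http.cache-control !~ "s-maxage") {
                if (bereq.url ~ "(?i)\.(jpg|jpeg|png|gif|css|js)$") {
                    unset beresp.http.Set-Cookie;
                    set beresp.ttl = 3600s;
                }
            }
        }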
            
    Trimming cached objects: purge, ban 
        (1) How a purge is carried out
            sub vcl_purge {
                return (synth(200,"Purged"));
            }
            
        (2) When a purge is triggered
            sub vcl_recv {
                if (req.method == "PURGE") {
                    return(purge);
                }
                ...
            }
            
        Adding access control rules for such requests:
            acl purgers {
                "127.0.0.0"/8;
                "10.1.0.0"/16;
            }
            
            sub vcl_recv {
                if (req.method == "PURGE") {
                    if (!client.ip ~ purgers) {
                        return(synth(405,"Purging not allowed for " + client.ip));
                    }
                    return(purge);
                }
                ...
            }
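The decision made by the ACL rule above can be mirrored in Python with the standard `ipaddress` module. This is only a model of the logic, useful for checking which clients a given network list would admit; the function name `purge_status` and the returned status for an allowed purge (200, taken from the vcl_purge example) are assumptions of the sketch.

```python
import ipaddress

# The same networks as the purgers ACL in the VCL above.
PURGERS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.1.0.0/16"),
]


def purge_status(method, client_ip):
    """Return the status a request would receive under the ACL rule."""
    if method != "PURGE":
        return None  # not a purge request; normal processing continues
    address = ipaddress.ip_address(client_ip)
    if not any(address in network for network in PURGERS):
        return 405  # purging not allowed for this client
    return 200  # purge accepted
```

For instance, a PURGE from 10.1.2.3 is accepted, while one from 192.168.1.5 falls outside both networks and is refused with 405.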
            
    Configuring multiple backend hosts:
        backend default {
            .host = "172.16.100.6";
            .port = "80";
        }

        backend appsrv {
            .host = "172.16.100.7";
            .port = "80";
        }
        
        sub vcl_recv {              
            if (req.url ~ "(?i)\.php$") {
                set req.backend_hint = appsrv;
            } else {
                set req.backend_hint = default;
            }   
            
            ...
        }
        
    Director:
        a varnish module (VMOD); 
            it must be imported before use:
                import directors;
        
        Example:
            import directors;    # load the directors

            backend server1 {
                .host = 
                .port = 
            }
            backend server2 {
                .host = 
                .port = 
            }

            sub vcl_init {
                new GROUP_NAME = directors.round_robin();
                GROUP_NAME.add_backend(server1);
                GROUP_NAME.add_backend(server2);
            }

            sub vcl_recv {
                # send all traffic to the GROUP_NAME director:
                set req.backend_hint = GROUP_NAME.backend();
            }
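To make the round-robin behavior concrete, here is a minimal Python model of a director that hands out backends in rotation. It only mimics the add_backend()/backend() calls used in the VCL above; the class itself is invented for illustration and ignores backend health.

```python
class RoundRobinDirector:
    """Toy model of a round-robin director: backends are used in turn."""

    def __init__(self):
        self._backends = []
        self._next = 0  # index of the backend to hand out next

    def add_backend(self, backend):
        self._backends.append(backend)

    def backend(self):
        chosen = self._backends[self._next % len(self._backends)]
        self._next += 1
        return chosen


group = RoundRobinDirector()
group.add_backend("server1")
group.add_backend("server2")
picks = [group.backend() for _ in range(4)]
```

Four consecutive requests alternate between the two backends: server1, server2, server1, server2.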
        
    BE Health Check:
        backend BE_NAME {
            .host =  
            .port = 
            .probe = {
                .url= 
                .timeout= 
                .interval= 
                .window=
                .threshold=
            }
        }
        
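        .probe: defines how the health check is performed;
            .url: the URL requested by the check, "/" by default; 
            .request: the exact request to send;
                .request = 
                    "GET /.healthtest.html HTTP/1.1"
                    "Host: www.magedu.com"
                    "Connection: close";
            .window: how many of the most recent checks are considered when judging health; 
            .threshold: at least this many of the last .window checks must have succeeded for the backend to be considered healthy;
            .interval: how often to check; 
            .timeout: how long to wait before a check times out;
            .expected_response: the expected response status code, 200 by default;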
            
        Ways to configure health checks:
            (1) probe PB_NAME { }
                backend NAME {
                    .probe = PB_NAME;
                    ...
                }
                 
            (2) backend NAME  {
                .probe = {
                    ...
                }
            }

        Example:
            probe check {
                .url = "/.healthcheck.html";
                .window = 5;
                .threshold = 4;
                .interval = 2s;
                .timeout = 1s;
            }

            backend default {
                .host = "10.1.0.68";
                .port = "80";
                .probe = check;
            }

            backend appsrv {
                .host = "10.1.0.69";
                .port = "80";
                .probe = check;
            }               
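How .window and .threshold interact can be modeled with a sliding window of recent results. The Python sketch below uses the values from the probe above (window 5, threshold 4); the class name is made up, and details such as the probe's initial state are not modeled.

```python
from collections import deque


class ProbeWindow:
    """Keep the last `window` probe results and apply the threshold."""

    def __init__(self, window=5, threshold=4):
        self.threshold = threshold
        self._results = deque(maxlen=window)  # old results drop off the left

    def record(self, success):
        self._results.append(bool(success))

    def healthy(self):
        return sum(self._results) >= self.threshold


probe = ProbeWindow(window=5, threshold=4)
for _ in range(5):
    probe.record(True)       # five good probes
probe.record(False)          # one failure: 4 of the last 5 still succeeded
after_one_failure = probe.healthy()
probe.record(False)          # second failure: only 3 of the last 5 succeeded
after_two_failures = probe.healthy()
```

A single failed probe is tolerated, but a second failure inside the window drops the count below the threshold and the backend is treated as sick.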
            
            
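    varnish runtime parameters:
        Threading model:
            cache-worker
            cache-main
            ban lurker
            acceptor:
            epoll/kqueue:
            ...
            
        Thread-related parameters:
            Within a thread pool, each request is handled by one thread; the maximum number of worker threads therefore determines how many requests varnish can serve concurrently;
            
            thread_pools: Number of worker thread pools. Ideally less than or equal to the number of CPU cores; 
            thread_pool_max: The maximum number of worker threads in each pool.
            thread_pool_min: The minimum number of worker threads in each pool. It also acts as the "maximum number of idle threads";
            
                Maximum concurrent connections = thread_pools * thread_pool_max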
                
            thread_pool_timeout:Thread idle threshold.  Threads in excess of thread_pool_min, which have been idle for at least this long, will be destroyed.
            thread_pool_add_delay:Wait at least this long after creating a thread.
            thread_pool_destroy_delay:Wait this long after destroying a thread.
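The arithmetic behind these parameters is simple enough to check by hand. The Python sketch below uses thread_pool_min=5 and thread_pool_max=500 from the DAEMON_OPTS line earlier in these notes; the value of 2 for thread_pools is an assumption made for the example.

```python
def max_concurrency(thread_pools, thread_pool_max):
    """Upper bound on requests handled at once: one worker thread each."""
    return thread_pools * thread_pool_max


def idle_floor(thread_pools, thread_pool_min):
    """Worker threads kept alive even when there is no traffic."""
    return thread_pools * thread_pool_min


thread_pools = 2  # assumed value for this example
ceiling = max_concurrency(thread_pools, thread_pool_max=500)
floor = idle_floor(thread_pools, thread_pool_min=5)
```

With these settings varnish would keep at least 10 worker threads around and could grow to 1000 under load.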
            
            How to set them:
                vcl.param 
                param.set
            
            Making the settings permanent:
                varnish.params
                    DAEMON_OPTS="-p PARAM1=VALUE -p PARAM2=VALUE"
                    
    varnish log area:
        shared memory log 
            counters
            log records
            
        1. varnishstat - Varnish Cache statistics
            -1
            -1 -f FIELD_NAME 
            -l: list the field names that can be given to the -f option;
            
            MAIN.cache_hit 
            MAIN.cache_miss
            
            # varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
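The two counters printed by that command are enough to derive the hit ratio. The Python sketch below parses text in the shape of `varnishstat -1` output; the sample lines and their numbers are invented, and the column layout (name, value, rate, description) is an assumption to verify against your own output.

```python
def parse_varnishstat(text):
    """Map counter name to its integer value, one counter per line."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


# Hypothetical output of: varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss
sample = """\
MAIN.cache_hit            9000         1.50 Cache hits
MAIN.cache_miss           1000         0.17 Cache misses
"""

counters = parse_varnishstat(sample)
hits = counters["MAIN.cache_hit"]
misses = counters["MAIN.cache_miss"]
ratio = hits / (hits + misses)
```

For the sample numbers the ratio is 0.9, i.e. nine out of ten lookups were served from the cache.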
            
        2. varnishtop - Varnish log entry ranking
            -1     Instead of a continuously updated display, print the statistics once and exit.
            -i taglist: multiple -i options may be given, and a single option may list several tags;
            -I <[taglist:]regex>
            -x taglist: exclusion list
            -X  <[taglist:]regex>
            
        3. varnishlog - Display Varnish logs
            
        4. varnishncsa - Display Varnish logs in Apache / NCSA combined log format
        
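    Built-in functions:
        hash_data(): specifies the data fed into the hash computation; reducing variation in it raises the hit ratio;
        regsub(str,regex,sub): replaces the first match of regex in str with sub; mainly used for URL rewriting
        regsuball(str,regex,sub): replaces every match of regex in str with sub;
        return():
        ban(expression) 
        ban_url(regex): bans all cached objects whose URL matches regex (Varnish 3 syntax; in VCL 4.0 use ban("req.url ~ " + regex) instead);
        synth(status,"STRING"): builds a synthetic response, e.g. to answer a purge request;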
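The difference between regsub and regsuball is the same as limiting or not limiting the replacement count in Python's `re.sub`. The sketch below is an analogy only: the helper functions are named after the VCL built-ins for readability, and Python's regex dialect is not identical to the PCRE syntax VCL uses.

```python
import re


def regsub(text, regex, sub):
    """Replace only the first match, like VCL regsub()."""
    return re.sub(regex, sub, text, count=1)


def regsuball(text, regex, sub):
    """Replace every match, like VCL regsuball()."""
    return re.sub(regex, sub, text)


url = "/a//b//c"
first_only = regsub(url, "//", "/")
every_one = regsuball(url, "//", "/")
host = regsub("www.example.com", r"^www\.", "")
```

Here `first_only` is "/a/b//c" and `every_one` is "/a/b/c"; normalizing URLs or hosts this way before hashing is one way to reduce variation and raise the hit ratio.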
        
            
Summary:
    varnish: state engine, vcl 
        varnish 4.0:
            vcl_init 
            vcl_recv
            vcl_hash 
            vcl_hit 
            vcl_pass
            vcl_miss 
            vcl_pipe 
            vcl_waiting
            vcl_purge 
            vcl_deliver
            vcl_synth
            vcl_fini
            
            vcl_backend_fetch
            vcl_backend_response
            vcl_backend_error 
            
        sub VCL_STATE_ENGINE 
        backend BE_NAME {} 
        probe PB_NAME {}
        acl ACL_NAME {}
        
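Blog assignment: everything above;
Extracurricular practice: (1) monitor varnish business metrics with zabbix;
          (2) deploy varnish quickly with ansible; 
          (3) deploy wordpress on two LAMP hosts behind an Nginx reverse proxy and load-test it; then place a varnish cache behind nginx, tune the vcl, and load-test repeatedly;
          
    ab, http_load, webbench, siege, jmeter, loadrunner,...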