learn.lianglianglee.com/专栏/ElasticSearch知识体系详解/12 聚合:聚合查询之Pipline聚合详解.md.html
2022-08-14 03:40:33 +08:00

<!DOCTYPE html>
<!-- saved from url=(0046)https://kaiiiz.github.io/hexo-theme-book-demo/ -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1.0, user-scalable=no">
<link rel="icon" href="/static/favicon.png">
<title>12 Aggregations: Pipeline Aggregations in Detail</title>
<!-- Spectre.css framework -->
<link rel="stylesheet" href="/static/index.css">
<!-- theme css & js -->
<meta name="generator" content="Hexo 4.2.0">
</head>
<body>
<div class="book-container">
<div class="book-sidebar">
<div class="book-brand">
<a href="/">
<img src="/static/favicon.png">
<span>Technical Article Excerpts</span>
</a>
</div>
<div class="book-menu uncollapsible">
<ul class="uncollapsible">
<li><a href="/" class="current-tab">Home</a></li>
</ul>
<ul class="uncollapsible">
<li><a href="../">Up one level</a></li>
</ul>
<ul class="uncollapsible">
<li>
<a href="/专栏/ElasticSearch知识体系详解/01 认知ElasticSearch基础概念.md.html">01 Overview: ElasticSearch Basic Concepts</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/02 认知Elastic Stack生态和场景方案.md.html">02 Overview: The Elastic Stack Ecosystem and Solution Scenarios</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/03 安装ElasticSearch和Kibana安装.md.html">03 Installation: Installing ElasticSearch and Kibana</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/04 入门:查询和聚合的基础使用.md.html">04 Getting Started: Basic Use of Queries and Aggregations</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/05 索引:索引管理详解.md.html">05 Index: Index Management in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/06 索引:索引模板(Index Template)详解.md.html">06 Index: Index Templates in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/07 查询DSL查询之复合查询详解.md.html">07 Query: DSL Compound Queries in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/08 查询DSL查询之全文搜索详解.md.html">08 Query: DSL Full-Text Search in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/09 查询DSL查询之Term详解.md.html">09 Query: DSL Term Queries in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/10 聚合聚合查询之Bucket聚合详解.md.html">10 Aggregations: Bucket Aggregations in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/11 聚合聚合查询之Metric聚合详解.md.html">11 Aggregations: Metric Aggregations in Detail</a>
</li>
<li>
<a class="current-tab" href="/专栏/ElasticSearch知识体系详解/12 聚合聚合查询之Pipline聚合详解.md.html">12 Aggregations: Pipeline Aggregations in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/13 原理从图解构筑对ES原理的初步认知.md.html">13 Internals: A First Look at How ES Works, with Diagrams</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/14 原理ES原理知识点补充和整体结构.md.html">14 Internals: Additional Notes on ES Internals and Overall Structure</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/15 原理ES原理之索引文档流程详解.md.html">15 Internals: The Document Indexing Flow in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/16 原理ES原理之读取文档流程详解.md.html">16 Internals: The Document Read Flow in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/17 优化ElasticSearch性能优化详解.md.html">17 Optimization: ElasticSearch Performance Tuning in Detail</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/18 大厂实践:腾讯万亿级 Elasticsearch 技术实践.md.html">18 Industry Practice: Tencent's Trillion-Scale Elasticsearch in Practice</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/19 资料Awesome Elasticsearch.md.html">19 Resources: Awesome Elasticsearch</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/20 WrapperQuery.md.html">20 WrapperQuery</a>
</li>
<li>
<a href="/专栏/ElasticSearch知识体系详解/21 备份和迁移.md.html">21 Backup and Migration</a>
</li>
</ul>
</div>
</div>
<div class="sidebar-toggle" onclick="sidebar_toggle()" onmouseover="add_inner()" onmouseleave="remove_inner()">
<div class="sidebar-toggle-inner"></div>
</div>
<script>
function add_inner() {
let inner = document.querySelector('.sidebar-toggle-inner')
inner.classList.add('show')
}
function remove_inner() {
let inner = document.querySelector('.sidebar-toggle-inner')
inner.classList.remove('show')
}
function sidebar_toggle() {
let sidebar_toggle = document.querySelector('.sidebar-toggle')
let sidebar = document.querySelector('.book-sidebar')
let content = document.querySelector('.off-canvas-content')
if (sidebar_toggle.classList.contains('extend')) { // show
sidebar_toggle.classList.remove('extend')
sidebar.classList.remove('hide')
content.classList.remove('extend')
} else { // hide
sidebar_toggle.classList.add('extend')
sidebar.classList.add('hide')
content.classList.add('extend')
}
}
function open_sidebar() {
let sidebar = document.querySelector('.book-sidebar')
let overlay = document.querySelector('.off-canvas-overlay')
sidebar.classList.add('show')
overlay.classList.add('show')
}
function hide_canvas() {
let sidebar = document.querySelector('.book-sidebar')
let overlay = document.querySelector('.off-canvas-overlay')
sidebar.classList.remove('show')
overlay.classList.remove('show')
}
</script>
<div class="off-canvas-content">
<div class="columns">
<div class="column col-12 col-lg-12">
<div class="book-navbar">
<!-- For Responsive Layout -->
<header class="navbar">
<section class="navbar-section">
<a onclick="open_sidebar()">
<i class="icon icon-menu"></i>
</a>
</section>
</header>
</div>
<div class="book-content" style="max-width: 960px; margin: 0 auto;
overflow-x: auto;
overflow-y: hidden;">
<div class="book-post">
<p id="tip" align="center"></p>
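<div><h1>12 Aggregations: Pipeline Aggregations in Detail</h1>
<h2>How to Understand Pipeline Aggregations</h2>
<blockquote>
<p>How should you think about pipeline aggregations? The key is to look at the feature from the designer's point of view and ask what it is meant to achieve: the result of one aggregation becomes the input of the next. That is the pipeline.</p>
</blockquote>
<h3>Common Scenarios for the Pipeline Mechanism</h3>
<blockquote>
<p>First, let's review the common application scenarios of pipeline designs that were introduced in the article on Tomcat's pipeline mechanism.</p>
</blockquote>
<h4>Chain of Responsibility Pattern</h4>
<p>In design-pattern terms, the pipeline mechanism is an instance of the Chain of Responsibility pattern. If you are not familiar with it, see the following description:</p>
<p>Chain of Responsibility: with this pattern you create a chain of objects for a request. Each object in turn inspects the request and either handles it or passes it on to the next object in the chain.</p>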
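<p>The description above can be sketched in a few lines of Python. This is only an illustration of the pattern: the handler names, the request fields (<code>user</code>, <code>path</code>) and the response strings are all made up for the example and do not come from any framework.</p>
<pre><code class="language-python">from typing import Callable, Optional

# Hypothetical handler type: returns a response string if it handles the
# request, or None to pass the request on to the next handler.
Handler = Callable[[dict], Optional[str]]


def auth_handler(request: dict) -> Optional[str]:
    # Handles (rejects) requests without a user; otherwise passes them on.
    if not request.get('user'):
        return '401 unauthorized'
    return None


def static_handler(request: dict) -> Optional[str]:
    # Only handles requests for static files.
    if request.get('path', '').startswith('/static/'):
        return '200 static file'
    return None


def fallback_handler(request: dict) -> Optional[str]:
    # Last link in the chain: handles everything that reaches it.
    return '404 not found'


def run_chain(handlers: list, request: dict) -> Optional[str]:
    # Offer the request to each handler in order and stop at the first
    # one that actually handles it.
    for handler in handlers:
        response = handler(request)
        if response is not None:
            return response
    return None


chain = [auth_handler, static_handler, fallback_handler]
print(run_chain(chain, {'user': 'alice', 'path': '/static/logo.png'}))
print(run_chain(chain, {'path': '/static/logo.png'}))
</code></pre>
<p>The first call prints <code>200 static file</code> and the second prints <code>401 unauthorized</code>. Each handler knows nothing about the others, which is the decoupling the pattern is after.</p>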
<h4>FilterChain</h4>
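<p>The form of Chain of Responsibility that developers meet most often is the FilterChain, which shows up in many software designs:</p>
<ul>
<li><strong>For example, in the Spring Security framework</strong></li>
</ul>
<p><img src="assets/tomcat-x-pipline-6.jpg" alt="img" /></p>
<ul>
<li><strong>For example, in the filters that process an HttpServletRequest</strong></li>
</ul>
<p>When a request arrives, it has to go through a series of processing steps. The Chain of Responsibility pattern turns each step into a component and reduces coupling. It can also be used to find a suitable way to handle an incoming request: when one handler is not suitable for the request, the request is passed to the next handler, which then tries to process it.</p>
<p>The figure below was found online; a later article explains this in more detail using Tomcat's request processing.</p>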
<p><img src="assets/tomcat-x-pipline-5.jpg" alt="img" /></p>
<h3>The Pipeline Mechanism in ElasticSearch</h3>
<p>Put simply: the result of one aggregation becomes the input of the next. That is the pipeline.</p>
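<p>As a rough mental model (plain Python, not how Elasticsearch is implemented), the idea can be written as two stages where the second stage only ever sees the output of the first. The sample documents and the function names below are invented for the illustration.</p>
<pre><code class="language-python">from collections import defaultdict

# Made-up sample documents (date, price); not taken from a real index.
docs = [
    {'date': '2015-01-01', 'price': 200.0},
    {'date': '2015-01-10', 'price': 150.0},
    {'date': '2015-01-20', 'price': 200.0},
    {'date': '2015-02-05', 'price': 10.0},
    {'date': '2015-02-15', 'price': 50.0},
    {'date': '2015-03-02', 'price': 175.0},
    {'date': '2015-03-20', 'price': 200.0},
]


def sum_per_month(documents):
    # Stage 1: bucket documents by month (YYYY-MM) and sum the price per bucket.
    totals = defaultdict(float)
    for doc in documents:
        totals[doc['date'][:7]] += doc['price']
    return dict(sorted(totals.items()))


def average_of_buckets(bucket_values):
    # Stage 2: consumes only the output of stage 1, never the raw documents.
    values = list(bucket_values.values())
    return sum(values) / len(values)


monthly = sum_per_month(docs)
print(monthly)
print(round(average_of_buckets(monthly), 2))
</code></pre>
<p>With this sample data the monthly sums are 550.0, 60.0 and 375.0, and their average rounds to 328.33.</p>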
<p>After that, it is only a matter of providing support for the different types of aggregation, for example:</p>
<p><img src="assets/es-agg-pipeline-1.png" alt="img" /></p>
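<blockquote>
<p>First dimension: there are many different <strong>types</strong> of pipeline aggregation, each computing different information from other aggregations, but they can be divided into two families:</p>
</blockquote>
<ul>
<li><strong>Parent</strong>: a family of pipeline aggregations that is given the output of its parent aggregation and can compute new buckets, or new aggregations to add to existing buckets.</li>
<li><strong>Sibling</strong>: pipeline aggregations that are given the output of a sibling aggregation and can compute a new aggregation at the same level as that sibling.</li>
</ul>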
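<p>The difference between the two families is mostly about where the result ends up. The sketch below imitates that in plain Python with sample bucket values; it only illustrates the shape of the results. The function names are hypothetical; in Elasticsearch itself <code>avg_bucket</code> is a sibling aggregation and <code>cumulative_sum</code> is a parent aggregation.</p>
<pre><code class="language-python"># Monthly sums as a bucket aggregation might return them (sample values).
buckets = [
    {'key': '2015-01', 'sales': 550.0},
    {'key': '2015-02', 'sales': 60.0},
    {'key': '2015-03', 'sales': 375.0},
]


def sibling_avg(bucket_list, metric):
    # Sibling style: reads every bucket and returns ONE new value that sits
    # next to the bucket list, at the same level.
    return sum(b[metric] for b in bucket_list) / len(bucket_list)


def parent_cumulative_sum(bucket_list, metric, name):
    # Parent style: runs inside the parent aggregation and adds a new value
    # to each existing bucket (here a running total).
    running = 0.0
    result = []
    for b in bucket_list:
        running += b[metric]
        result.append({**b, name: running})
    return result


print(round(sibling_avg(buckets, 'sales'), 2))
for b in parent_cumulative_sum(buckets, 'sales', 'cumulative_sales'):
    print(b['key'], b['cumulative_sales'])
</code></pre>
<p>The sibling call yields a single number (328.33 after rounding), while the parent call returns the same three buckets, each extended with a running total (550.0, 610.0, 985.0).</p>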
<blockquote>
<p>Second dimension: classification by the intent of the <strong>functional design</strong></p>
</blockquote>
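<p>For example, the preceding aggregation may be a Bucket aggregation and the following one may be based on a Metric aggregation; that combination then forms one class of pipeline.</p>
<p>This leads to the <code>xxx bucket</code> family (easy to understand now, isn't it? @pdai)</p>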
<ul>
<li>
<p>Bucket aggregation -&gt; Metric aggregation</p>
<p>The result of the bucket aggregation becomes the input of the metric aggregation in the next step.</p>
<ul>
<li>Average bucket</li>
<li>Min bucket</li>
<li>Max bucket</li>
<li>Sum bucket</li>
<li>Stats bucket</li>
<li>Extended stats bucket</li>
</ul>
</li>
</ul>
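<p>All of the types in this list are requested the same way: a sibling entry next to the bucket aggregation, pointing at a metric inside it through <code>buckets_path</code>. The helper below is a hypothetical convenience function (not part of any Elasticsearch client) that builds such a request body; the field names <code>date</code> and <code>price</code> and the <code>_result</code> suffix are assumptions borrowed from or added to the examples in this article.</p>
<pre><code class="language-python">import json

# The sibling pipeline types listed above, by their request keyword.
SIBLING_TYPES = {'avg_bucket', 'min_bucket', 'max_bucket',
                 'sum_bucket', 'stats_bucket', 'extended_stats_bucket'}


def build_request(pipeline_type, histogram_name='sales_per_month',
                  metric_name='sales'):
    # Hypothetical helper: builds a search body with a monthly date_histogram,
    # a sum metric inside it, and one sibling pipeline aggregation beside it.
    if pipeline_type not in SIBLING_TYPES:
        raise ValueError('unsupported pipeline type: ' + pipeline_type)
    return {
        'size': 0,
        'aggs': {
            histogram_name: {
                'date_histogram': {'field': 'date', 'calendar_interval': 'month'},
                'aggs': {metric_name: {'sum': {'field': 'price'}}},
            },
            pipeline_type + '_result': {
                pipeline_type: {'buckets_path': histogram_name + '>' + metric_name},
            },
        },
    }


print(json.dumps(build_request('max_bucket'), indent=2))
</code></pre>
<p>Swapping <code>'max_bucket'</code> for any other name in the set changes only the keyword of the pipeline entry; the bucket and metric parts of the request stay the same.</p>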
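<p>For building a mental framework, understanding the above is enough; the other types are just icing on the cake.</p>
<h2>Some Examples</h2>
<blockquote>
<p>Here we just walk through a few simple examples; if you need to use a specific aggregation, consult the documentation. @pdai</p>
</blockquote>
<h3>Average bucket aggregation</h3>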
<pre><code class="language-bash">POST _search
{
&quot;size&quot;: 0,
&quot;aggs&quot;: {
&quot;sales_per_month&quot;: {
&quot;date_histogram&quot;: {
&quot;field&quot;: &quot;date&quot;,
&quot;calendar_interval&quot;: &quot;month&quot;
},
&quot;aggs&quot;: {
&quot;sales&quot;: {
&quot;sum&quot;: {
&quot;field&quot;: &quot;price&quot;
}
}
}
},
&quot;avg_monthly_sales&quot;: {
&quot;avg_bucket&quot;: {
&quot;buckets_path&quot;: &quot;sales_per_month&gt;sales&quot;,
&quot;gap_policy&quot;: &quot;skip&quot;,
&quot;format&quot;: &quot;#,##0.00;(#,##0.00)&quot;
}
}
}
}
</code></pre>
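<ul>
<li>Nested bucket aggregation: builds a monthly date histogram and sums the price within each month.</li>
<li>Metric step: <code>avg_bucket</code> then takes the average of those monthly sums.</li>
</ul>
<p><strong>Parameters</strong></p>
<ul>
<li>buckets_path: the path to the aggregation whose values are used as input. Multi-level nested aggregations are supported, with levels separated by <code>&gt;</code>.</li>
<li>gap_policy: the policy applied when the pipeline aggregation meets a missing value (somewhat like the <code>missing</code> option of terms and similar aggregations). Possible values are skip and insert_zeros.</li>
<li>skip: treats missing data as if the bucket did not exist. The bucket is skipped and the calculation continues with the next available value. (insert_zeros instead replaces the missing value with 0.)</li>
<li>format: a DecimalFormat pattern for the output value of this aggregation. When it is set, the formatted value is returned in <code>value_as_string</code>.</li>
</ul>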
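<p>To make <code>buckets_path</code> and <code>gap_policy</code> more concrete, here is a simplified Python model of them. It is a sketch under assumptions: a gap is represented as <code>None</code>, only the <code>&gt;</code> separator of the path syntax is handled, and the function names are invented. It is not Elasticsearch code.</p>
<pre><code class="language-python">def parse_buckets_path(path):
    # 'sales_per_month>sales' names the aggregation levels from outer to inner.
    return path.split('>')


def avg_bucket(values, gap_policy='skip'):
    # values: one entry per bucket; None marks a bucket with no value (a gap).
    if gap_policy == 'skip':
        usable = [v for v in values if v is not None]
    elif gap_policy == 'insert_zeros':
        usable = [0.0 if v is None else v for v in values]
    else:
        raise ValueError('unknown gap_policy: ' + gap_policy)
    return sum(usable) / len(usable) if usable else None


monthly_sales = [550.0, None, 375.0]   # the middle month has no sales value
print(parse_buckets_path('sales_per_month>sales'))
print(avg_bucket(monthly_sales, 'skip'))
print(round(avg_bucket(monthly_sales, 'insert_zeros'), 2))
</code></pre>
<p>With <code>skip</code> the empty month is ignored and the average is 462.5 (two buckets). With <code>insert_zeros</code> it counts as 0 and the average drops to 308.33 (three buckets).</p>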
<p>The response looks like this:</p>
<pre><code class="language-json">{
&quot;took&quot;: 11,
&quot;timed_out&quot;: false,
&quot;_shards&quot;: ...,
&quot;hits&quot;: ...,
&quot;aggregations&quot;: {
&quot;sales_per_month&quot;: {
&quot;buckets&quot;: [
{
&quot;key_as_string&quot;: &quot;2015/01/01 00:00:00&quot;,
&quot;key&quot;: 1420070400000,
&quot;doc_count&quot;: 3,
&quot;sales&quot;: {
&quot;value&quot;: 550.0
}
},
{
&quot;key_as_string&quot;: &quot;2015/02/01 00:00:00&quot;,
&quot;key&quot;: 1422748800000,
&quot;doc_count&quot;: 2,
&quot;sales&quot;: {
&quot;value&quot;: 60.0
}
},
{
&quot;key_as_string&quot;: &quot;2015/03/01 00:00:00&quot;,
&quot;key&quot;: 1425168000000,
&quot;doc_count&quot;: 2,
&quot;sales&quot;: {
&quot;value&quot;: 375.0
}
}
]
},
&quot;avg_monthly_sales&quot;: {
&quot;value&quot;: 328.33333333333333,
&quot;value_as_string&quot;: &quot;328.33&quot;
}
}
}
</code></pre>
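<p>The two fields of <code>avg_monthly_sales</code> in the response can be reproduced by hand. The snippet below recomputes the average from the three bucket values and imitates the pattern <code>#,##0.00;(#,##0.00)</code> with Python string formatting. <code>format_value</code> is a hypothetical stand-in written for this example and is only an approximation of DecimalFormat (rounding of edge cases may differ).</p>
<pre><code class="language-python">def format_value(value):
    # Rough stand-in for the pattern '#,##0.00;(#,##0.00)':
    # thousands separators, two decimals, negatives wrapped in parentheses.
    text = f'{abs(value):,.2f}'
    return f'({text})' if 0 > value else text


monthly_sums = [550.0, 60.0, 375.0]   # the 'sales' values of the three buckets
average = sum(monthly_sums) / len(monthly_sums)
print(average)
print(format_value(average))
print(format_value(-1234.5))
</code></pre>
<p>The formatted average is <code>328.33</code>, matching <code>value_as_string</code> above, and a negative input such as -1234.5 would be rendered as <code>(1,234.50)</code>.</p>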
<h3>Stats bucket aggregation</h3>
<p>Building on that, stats bucket is just as easy to understand:</p>
<pre><code class="language-bash">POST /sales/_search
{
&quot;size&quot;: 0,
&quot;aggs&quot;: {
&quot;sales_per_month&quot;: {
&quot;date_histogram&quot;: {
&quot;field&quot;: &quot;date&quot;,
&quot;calendar_interval&quot;: &quot;month&quot;
},
&quot;aggs&quot;: {
&quot;sales&quot;: {
&quot;sum&quot;: {
&quot;field&quot;: &quot;price&quot;
}
}
}
},
&quot;stats_monthly_sales&quot;: {
&quot;stats_bucket&quot;: {
&quot;buckets_path&quot;: &quot;sales_per_month&gt;sales&quot;
}
}
}
}
</code></pre>
<p>Response:</p>
<pre><code class="language-bash">{
&quot;took&quot;: 11,
&quot;timed_out&quot;: false,
&quot;_shards&quot;: ...,
&quot;hits&quot;: ...,
&quot;aggregations&quot;: {
&quot;sales_per_month&quot;: {
&quot;buckets&quot;: [
{
&quot;key_as_string&quot;: &quot;2015/01/01 00:00:00&quot;,
&quot;key&quot;: 1420070400000,
&quot;doc_count&quot;: 3,
&quot;sales&quot;: {
&quot;value&quot;: 550.0
}
},
{
&quot;key_as_string&quot;: &quot;2015/02/01 00:00:00&quot;,
&quot;key&quot;: 1422748800000,
&quot;doc_count&quot;: 2,
&quot;sales&quot;: {
&quot;value&quot;: 60.0
}
},
{
&quot;key_as_string&quot;: &quot;2015/03/01 00:00:00&quot;,
&quot;key&quot;: 1425168000000,
&quot;doc_count&quot;: 2,
&quot;sales&quot;: {
&quot;value&quot;: 375.0
}
}
]
},
&quot;stats_monthly_sales&quot;: {
&quot;count&quot;: 3,
&quot;min&quot;: 60.0,
&quot;max&quot;: 550.0,
&quot;avg&quot;: 328.3333333333333,
&quot;sum&quot;: 985.0
}
}
}
</code></pre>
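<p>The numbers in <code>stats_monthly_sales</code> follow directly from the three monthly sums. As a quick cross-check, a plain-Python equivalent (the function name is made up; this is not an Elasticsearch API) looks like this:</p>
<pre><code class="language-python">def stats_bucket(values):
    # Computes the same five fields that appear in the response above.
    return {
        'count': len(values),
        'min': min(values),
        'max': max(values),
        'avg': sum(values) / len(values),
        'sum': sum(values),
    }


monthly_sums = [550.0, 60.0, 375.0]   # the 'sales' values of the three buckets
print(stats_bucket(monthly_sums))
</code></pre>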
<h2>References</h2>
<p>https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html</p>
</div>
</div>
<div>
<div style="float: left">
<a href="/专栏/ElasticSearch知识体系详解/11 聚合聚合查询之Metric聚合详解.md.html">Previous</a>
</div>
<div style="float: right">
<a href="/专栏/ElasticSearch知识体系详解/13 原理从图解构筑对ES原理的初步认知.md.html">Next</a>
</div>
</div>
</div>
</div>
</div>
</div>
<a class="off-canvas-overlay" onclick="hide_canvas()"></a>
</div>
<script defer src="https://static.cloudflareinsights.com/beacon.min.js/v652eace1692a40cfa3763df669d7439c1639079717194" integrity="sha512-Gi7xpJR8tSkrpF7aordPZQlW2DLtzUlZcumS8dMQjwDHEnw9I7ZLyiOj/6tZStRBGtGgN6ceN6cMH8z7etPGlw==" data-cf-beacon='{"rayId":"70996fa5bef13d60","version":"2021.12.0","r":1,"token":"1f5d475227ce4f0089a7cff1ab17c0f5","si":100}' crossorigin="anonymous"></script>
</body>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-NPSEEVD756"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-NPSEEVD756');
var path = window.location.pathname
var cookie = getCookie("lastPath");
console.log(path)
if (path.replace("/", "") === "") {
if (cookie.replace("/", "") !== "") {
console.log(cookie)
document.getElementById("tip").innerHTML = "<a href='" + cookie + "'>Jump to where you left off</a>"
}
} else {
setCookie("lastPath", path)
}
function setCookie(cname, cvalue) {
var d = new Date();
d.setTime(d.getTime() + (180 * 24 * 60 * 60 * 1000));
var expires = "expires=" + d.toGMTString();
document.cookie = cname + "=" + cvalue + "; " + expires + ";path = /";
}
function getCookie(cname) {
var name = cname + "=";
var ca = document.cookie.split(';');
for (var i = 0; i < ca.length; i++) {
var c = ca[i].trim();
if (c.indexOf(name) === 0) return c.substring(name.length, c.length);
}
return "";
}
</script>
</html>