This article introduces the ways to query data in Druid. Before starting, make sure the sample data has been loaded successfully.
Druid queries are issued over HTTP, and the web console provides a query view that formats results for you.
Druid supports several ways to query: SQL, native JSON queries, plain HTTP calls with cURL, and client libraries.
1. SQL Queries
We use the Wikipedia sample data as an example.
Query the 10 pages with the most edits:
SELECT page, COUNT(*) AS Edits
FROM wikipedia
WHERE TIMESTAMP '2015-09-12 00:00:00' <= "__time" AND "__time" < TIMESTAMP '2015-09-13 00:00:00'
GROUP BY page
ORDER BY Edits DESC
LIMIT 10

Run this in the console's Query view.
The console shows hints as you type.
Selecting Smart query limit automatically limits the number of rows returned.
Druid also provides a command-line client for SQL. Run bin/dsql to start it:

Welcome to dsql, the command-line client for Druid SQL.
Type "\h" for help.
dsql>

Submit the SQL:
dsql> SELECT page, COUNT(*) AS Edits FROM wikipedia WHERE "__time" BETWEEN TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00' GROUP BY page ORDER BY Edits DESC LIMIT 10;
┌──────────────────────────────────────────────────────────┬───────┐
│ page                                                     │ Edits │
├──────────────────────────────────────────────────────────┼───────┤
│ Wikipedia:Vandalismusmeldung                             │    33 │
│ User:Cyde/List of candidates for speedy deletion/Subpage │    28 │
│ Jeremy Corbyn                                            │    27 │
│ Wikipedia:Administrators' noticeboard/Incidents          │    21 │
│ Flavia Pennetta                                          │    20 │
│ Total Drama Presents: The Ridonculous Race               │    18 │
│ User talk:Dudeperson176123                               │    18 │
│ Wikipédia:Le Bistro/12 septembre 2015                    │    18 │
│ Wikipedia:In the news/Candidates                         │    17 │
│ Wikipedia:Requests for page protection                   │    17 │
└──────────────────────────────────────────────────────────┴───────┘
Retrieved 10 rows in 0.06s.

You can also send SQL over HTTP:

curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/wikipedia-top-pages-sql.json http://localhost:8888/druid/v2/sql

which returns the following result:
[  {    "page": "Wikipedia:Vandalismusmeldung",    "Edits": 33  },  {    "page": "User:Cyde/List of candidates for speedy deletion/Subpage",    "Edits": 28  },  {    "page": "Jeremy Corbyn",    "Edits": 27  },  {    "page": "Wikipedia:Administrators' noticeboard/Incidents",    "Edits": 21  },  {    "page": "Flavia Pennetta",    "Edits": 20  },  {    "page": "Total Drama Presents: The Ridonculous Race",    "Edits": 18  },  {    "page": "User talk:Dudeperson176123",    "Edits": 18  },  {    "page": "Wikipédia:Le Bistro/12 septembre 2015",    "Edits": 18  },  {    "page": "Wikipedia:In the news/Candidates",    "Edits": 17  },  {    "page": "Wikipedia:Requests for page protection",    "Edits": 17  }]更多SQL示例
More SQL examples:

Query over time:

SELECT FLOOR(__time to HOUR) AS HourTime, SUM(deleted) AS LinesDeleted
FROM wikipedia
WHERE "__time" BETWEEN TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00'
GROUP BY 1
Group-by query:

SELECT channel, page, SUM(added)
FROM wikipedia
WHERE "__time" BETWEEN TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00'
GROUP BY channel, page
ORDER BY SUM(added) DESC
Query raw data:

SELECT user, page
FROM wikipedia
WHERE "__time" BETWEEN TIMESTAMP '2015-09-12 02:00:00' AND TIMESTAMP '2015-09-12 03:00:00'
LIMIT 5
Explain the query plan:

This can also be done in dsql:
dsql> EXPLAIN PLAN FOR SELECT page, COUNT(*) AS Edits FROM wikipedia WHERE "__time" BETWEEN TIMESTAMP '2015-09-12 00:00:00' AND TIMESTAMP '2015-09-13 00:00:00' GROUP BY page ORDER BY Edits DESC LIMIT 10;

DruidQueryRel(query=[{"queryType":"topN","dataSource":{"type":"table","name":"wikipedia"},"virtualColumns":[],"dimension":{"type":"default","dimension":"page","outputName":"d0","outputType":"STRING"},"metric":{"type":"numeric","metric":"a0"},"threshold":10,"intervals":{"type":"intervals","intervals":["2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.001Z"]},"filter":null,"granularity":{"type":"all"},"aggregations":[{"type":"count","name":"a0"}],"postAggregations":[],"context":{},"descending":false}], signature=[{d0:STRING, a0:LONG}])
Retrieved 1 row in 0.03s.

The plan shows that this SQL statement is translated into a native topN query.

2. Native JSON Queries
Druid supports queries written in native JSON:
{  "queryType" : "topN",  "dataSource" : "wikipedia",  "intervals" : ["2015-09-12/2015-09-13"],  "granularity" : "all",  "dimension" : "page",  "metric" : "count",  "threshold" : 10,  "aggregations" : [    {      "type" : "count",      "name" : "count"    }  ]}
把json粘贴到json 查询模式窗口
Json查询是通过向router和broker发送请求
curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d @<query_json_file>
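As a sketch of this template filled in with the topN query above, in Python (the requests library is our own choice for illustration; host and port assume the local quickstart Router):

import requests

# The topN query from above, expressed as a Python dict and POSTed
# to the native query endpoint /druid/v2.
top_pages = {
    "queryType": "topN",
    "dataSource": "wikipedia",
    "intervals": ["2015-09-12/2015-09-13"],
    "granularity": "all",
    "dimension": "page",
    "metric": "count",
    "threshold": 10,
    "aggregations": [{"type": "count", "name": "count"}],
}

resp = requests.post("http://localhost:8888/druid/v2?pretty", json=top_pages)
resp.raise_for_status()
print(resp.json())  # one result object per granularity bucket; "all" yields a single bucket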
Druid provides a rich set of native query types:
Aggregation queries

Timeseries query:

{
  "queryType": "timeseries",
  "dataSource": "sample_datasource",
  "granularity": "day",
  "descending": "true",
  "filter": {
    "type": "and",
    "fields": [
      { "type": "selector", "dimension": "sample_dimension1", "value": "sample_value1" },
      { "type": "or",
        "fields": [
          { "type": "selector", "dimension": "sample_dimension2", "value": "sample_value2" },
          { "type": "selector", "dimension": "sample_dimension3", "value": "sample_value3" }
        ]
      }
    ]
  },
  "aggregations": [
    { "type": "longSum", "name": "sample_name1", "fieldName": "sample_fieldName1" },
    { "type": "doubleSum", "name": "sample_name2", "fieldName": "sample_fieldName2" }
  ],
  "postAggregations": [
    { "type": "arithmetic",
      "name": "sample_divide",
      "fn": "/",
      "fields": [
        { "type": "fieldAccess", "name": "postAgg__sample_name1", "fieldName": "sample_name1" },
        { "type": "fieldAccess", "name": "postAgg__sample_name2", "fieldName": "sample_name2" }
      ]
    }
  ],
  "intervals": [ "2012-01-01T00:00:00.000/2012-01-03T00:00:00.000" ]
}

TopN query:

{
  "queryType": "topN",
  "dataSource": "sample_data",
  "dimension": "sample_dim",
  "threshold": 5,
  "metric": "count",
  "granularity": "all",
  "filter": {
    "type": "and",
    "fields": [
      {
        "type": "selector",
        "dimension": "dim1",
        "value": "some_value"
      },
      {
        "type": "selector",
        "dimension": "dim2",
        "value": "some_other_val"
      }
    ]
  },
  "aggregations": [
    {
      "type": "longSum",
      "name": "count",
      "fieldName": "count"
    },
    {
      "type": "doubleSum",
      "name": "some_metric",
      "fieldName": "some_metric"
    }
  ],
  "postAggregations": [
    {
      "type": "arithmetic",
      "name": "average",
      "fn": "/",
      "fields": [
        {
          "type": "fieldAccess",
          "name": "some_metric",
          "fieldName": "some_metric"
        },
        {
          "type": "fieldAccess",
          "name": "count",
          "fieldName": "count"
        }
      ]
    }
  ],
  "intervals": [
    "2013-08-31T00:00:00.000/2013-09-03T00:00:00.000"
  ]
}

GroupBy query:

{
  "queryType": "groupBy",
  "dataSource": "sample_datasource",
  "granularity": "day",
  "dimensions": ["country", "device"],
  "limitSpec": { "type": "default", "limit": 5000, "columns": ["country", "data_transfer"] },
  "filter": {
    "type": "and",
    "fields": [
      { "type": "selector", "dimension": "carrier", "value": "AT&T" },
      { "type": "or",
        "fields": [
          { "type": "selector", "dimension": "make", "value": "Apple" },
          { "type": "selector", "dimension": "make", "value": "Samsung" }
        ]
      }
    ]
  },
  "aggregations": [
    { "type": "longSum", "name": "total_usage", "fieldName": "user_count" },
    { "type": "doubleSum", "name": "data_transfer", "fieldName": "data_transfer" }
  ],
  "postAggregations": [
    { "type": "arithmetic",
      "name": "avg_usage",
      "fn": "/",
      "fields": [
        { "type": "fieldAccess", "fieldName": "data_transfer" },
        { "type": "fieldAccess", "fieldName": "total_usage" }
      ]
    }
  ],
  "intervals": [ "2012-01-01T00:00:00.000/2012-01-03T00:00:00.000" ],
  "having": {
    "type": "greaterThan",
    "aggregation": "total_usage",
    "value": 100
  }
}

Metadata queries

TimeBoundary query:

{
    "queryType" : "timeBoundary",
    "dataSource": "sample_datasource",
    "bound"     : < "maxTime" | "minTime" > # optional, defaults to returning both timestamps if not set
    "filter"    : { "type": "and", "fields": [<filter>, <filter>, ...] } # optional
}

SegmentMetadata query:

{
  "queryType":"segmentMetadata",
  "dataSource":"sample_datasource",
  "intervals":["2013-01-01/2014-01-01"]
}

DatasourceMetadata query:

{
    "queryType" : "dataSourceMetadata",
    "dataSource": "sample_datasource"
}

Search query:

{
  "queryType": "search",
  "dataSource": "sample_datasource",
  "granularity": "day",
  "searchDimensions": [
    "dim1",
    "dim2"
  ],
  "query": {
    "type": "insensitive_contains",
    "value": "Ke"
  },
  "sort" : {
    "type": "lexicographic"
  },
  "intervals": [
    "2013-01-01T00:00:00.000/2013-01-03T00:00:00.000"
  ]
}
Query tips

Prefer Timeseries and TopN over GroupBy where possible: Timeseries avoids grouping overhead when you only aggregate over time, and TopN (an approximate ranking over one dimension) is much cheaper than an exact GroupBy with ORDER BY ... LIMIT; see the sketch below.
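A sketch of what this substitution looks like, using the wikipedia example from earlier (the query dicts mirror the native query shapes above; the aggregator name "edits" is our own illustration):

# A single-dimension groupBy ordered by an aggregate with a limit ...
group_by = {
    "queryType": "groupBy",
    "dataSource": "wikipedia",
    "granularity": "all",
    "dimensions": ["page"],
    "aggregations": [{"type": "count", "name": "edits"}],
    "limitSpec": {
        "type": "default",
        "limit": 10,
        "columns": [{"dimension": "edits", "direction": "descending"}],
    },
    "intervals": ["2015-09-12/2015-09-13"],
}

# ... can usually be replaced by the cheaper, approximate topN form:
top_n = {
    "queryType": "topN",
    "dataSource": "wikipedia",
    "granularity": "all",
    "dimension": "page",
    "metric": "edits",      # rank by the "edits" aggregator
    "threshold": 10,
    "aggregations": [{"type": "count", "name": "edits"}],
    "intervals": ["2015-09-12/2015-09-13"],
}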
Canceling queries

To cancel a query, include a queryId in the query context when you submit it, then issue an HTTP DELETE against that id:

DELETE /druid/v2/{queryId}

curl -X DELETE "http://host:port/druid/v2/abc123"
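A minimal end-to-end sketch in Python, assuming the quickstart Router and the requests library (the id abc123 is just the illustrative value from the curl above):

import threading
import requests

DRUID = "http://localhost:8888"  # quickstart Router; adjust as needed
QUERY_ID = "abc123"              # illustrative id, as in the curl example

# Submit a query whose context carries our queryId so it can be canceled.
query = {
    "queryType": "topN",
    "dataSource": "wikipedia",
    "intervals": ["2015-09-12/2015-09-13"],
    "granularity": "all",
    "dimension": "page",
    "metric": "count",
    "threshold": 10,
    "aggregations": [{"type": "count", "name": "count"}],
    "context": {"queryId": QUERY_ID},
}

def cancel():
    # Cancellation is an HTTP DELETE on /druid/v2/{queryId}.
    requests.delete(f"{DRUID}/druid/v2/{QUERY_ID}")

threading.Timer(1.0, cancel).start()  # fire the cancel after one second
resp = requests.post(f"{DRUID}/druid/v2", json=query)
print(resp.status_code, resp.text)    # a canceled query comes back as an error response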
Query failures

If a query fails, Druid returns a JSON object describing the error, for example:

{
  "error" : "Query timeout",
  "errorMessage" : "Timeout waiting for task.",
  "errorClass" : "java.util.concurrent.TimeoutException",
  "host" : "druid1.example.com:8083"
}
3. cURL

Queries can be issued over plain HTTP with cURL:

curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/tutorial/wikipedia-top-pages.json http://localhost:8888/druid/v2?pretty

4. Client Libraries
Client libraries are built on the native JSON API.
For example, pydruid for Python:
from pydruid.client import *
from pylab import plt

query = PyDruid(druid_url_goes_here, 'druid/v2')

ts = query.timeseries(
    datasource='twitterstream',
    granularity='day',
    intervals='2014-02-02/p4w',
    aggregations={'length': doublesum('tweet_length'), 'count': doublesum('count')},
    post_aggregations={'avg_tweet_length': (Field('length') / Field('count'))},
    filter=Dimension('first_hashtag') == 'sochi2014')

df = query.export_pandas()
df['timestamp'] = df['timestamp'].map(lambda x: x.split('T')[0])
df.plot(x='timestamp', y='avg_tweet_length', ylim=(80, 140), rot=20,
        title='Sochi 2014')
plt.ylabel('avg tweet length (chars)')
plt.show()
