phoenix 如何优化成使用索引进行查询源码分析
2016-08-31 16:11
2331 查看
phoenix 如何优化成使用索引进行查询源码分析
在 phoenix 生成执行计划时,最后会进行执行计划的优化,在优化过程中,会根据当前的sql中有没有hint的提示使用索引,或者查询中用到了索引字段,也会自动转换成用索引表进行关联查询。
在下面的方法当中,会根据已有的执行计划,进行优化
QueryOptimizer.getApplicablePlans
//根据当前的数据执行计划,优化成用索引进行数据的查询 private List<QueryPlan> getApplicablePlans(QueryPlan dataPlan, PhoenixStatement statement, List<? extends PDatum> targetColumns, ParallelIteratorFactory parallelIteratorFactory, boolean stopAtBestPlan) throws SQLException { SelectStatement select = (SelectStatement)dataPlan.getStatement(); // Exit early if we have a point lookup as we can't get better than that if (!useIndexes || (dataPlan.getContext().getScanRanges().isPointLookup() && stopAtBestPlan)) { return Collections.singletonList(dataPlan); } // For single query tuple projection, indexes are inherited from the original table to the projected // table; otherwise not. So we pass projected table here, which is enough to tell if this is from a // single query or a part of join query. //判断是否使用索引进行处理 List<PTable>indexes = Lists.newArrayList(dataPlan.getContext().getResolver().getTables().get(0).getTable().getIndexes()); if (indexes.isEmpty() || dataPlan.isDegenerate() || dataPlan.getTableRef().hasDynamicCols() || select.getHint().hasHint(Hint.NO_INDEX)) { return Collections.singletonList(dataPlan); } // The targetColumns is set for UPSERT SELECT to ensure that the proper type conversion takes place. // For a SELECT, it is empty. In this case, we want to set the targetColumns to match the projection // from the dataPlan to ensure that the metadata for when an index is used matches the metadata for // when the data table is used. if (targetColumns.isEmpty()) { List<? extends ColumnProjector> projectors = dataPlan.getProjector().getColumnProjectors(); List<PDatum> targetDatums = Lists.newArrayListWithExpectedSize(projectors.size()); for (ColumnProjector projector : projectors) { targetDatums.add(projector.getExpression()); } targetColumns = targetDatums; } //在里面转换一下方式 SelectStatement translatedIndexSelect = IndexStatementRewriter.translate(select, FromCompiler.getResolver(dataPlan.getTableRef())); List<QueryPlan> plans = Lists.newArrayListWithExpectedSize(1 + indexes.size()); plans.add(dataPlan); //拿出指定的hint的查询方式 QueryPlan hintedPlan = getHintedQueryPlan(statement, translatedIndexSelect, indexes, targetColumns, parallelIteratorFactory, plans); if (hintedPlan != null) { if (stopAtBestPlan) { return Collections.singletonList(hintedPlan); } plans.add(0, hintedPlan); } for (PTable index : indexes) { //当没有指定hint的方式时 QueryPlan plan = addPlan(statement, translatedIndexSelect, index, targetColumns, parallelIteratorFactory, dataPlan, false); if (plan != null) { // Query can't possibly return anything so just return this plan. if (plan.isDegenerate()) { return Collections.singletonList(plan); } plans.add(plan); } } return hintedPlan == null ? orderPlansBestToWorst(select, plans) : plans; }
可以看到,上面根据当前是否具有hints提示和物理表是否创建了索引进行执行计划的优化
当使用hints提示时,会调用下面的方法
private static QueryPlan getHintedQueryPlan(PhoenixStatement statement, SelectStatement select, List<PTable> indexes, List<? extends PDatum> targetColumns, ParallelIteratorFactory parallelIteratorFactory, List<QueryPlan> plans) throws SQLException { QueryPlan dataPlan = plans.get(0); //查询语句中是否有hints String indexHint = select.getHint().getHint(Hint.INDEX); if (indexHint == null) { return null; } int startIndex = 0; String alias = dataPlan.getTableRef().getTableAlias(); //对hint的解释 (mytable myindex) String prefix = HintNode.PREFIX + (alias == null ? dataPlan.getTableRef().getTable().getName().getString() : alias) + HintNode.SEPARATOR; while (startIndex < indexHint.length()) { startIndex = indexHint.indexOf(prefix, startIndex); if (startIndex < 0) { return null; } startIndex += prefix.length(); boolean done = false; // true when SUFFIX found while (startIndex < indexHint.length() && !done) { int endIndex; int endIndex1 = indexHint.indexOf(HintNode.SEPARATOR, startIndex); int endIndex2 = indexHint.indexOf(HintNode.SUFFIX, startIndex); if (endIndex1 < 0 && endIndex2 < 0) { // Missing SUFFIX shouldn't happen endIndex = indexHint.length(); } else if (endIndex1 < 0) { done = true; endIndex = endIndex2; } else if (endIndex2 < 0) { endIndex = endIndex1; } else { endIndex = Math.min(endIndex1, endIndex2); done = endIndex2 == endIndex; } String indexName = indexHint.substring(startIndex, endIndex); int indexPos = getIndexPosition(indexes, indexName); if (indexPos >= 0) { // Hinted index is applicable, so return it's index PTable index = indexes.get(indexPos); indexes.remove(indexPos); //对hints解释成功了 QueryPlan plan = addPlan(statement, select, index, targetColumns, parallelIteratorFactory, dataPlan, true); if (plan != null) { return plan; } } startIndex = endIndex + 1; } } return null; }
上面解释出Sql中的查询的hints的提示,使用指定的索引进行查询,生成执行计划
//增加索引执行计划 private static QueryPlan addPlan(PhoenixStatement statement, SelectStatement select, PTable index, List<? extends PDatum> targetColumns, ParallelIteratorFactory parallelIteratorFactory, QueryPlan dataPlan, boolean isHinted) throws SQLException { int nColumns = dataPlan.getProjector().getColumnCount(); String tableAlias = dataPlan.getTableRef().getTableAlias(); String alias = tableAlias==null ? null : '"' + tableAlias + '"'; // double quote in case it's case sensitive String schemaName = index.getParentSchemaName().getString(); schemaName = schemaName.length() == 0 ? null : '"' + schemaName + '"'; String tableName = '"' + index.getTableName().getString() + '"'; TableNode table = FACTORY.namedTable(alias, FACTORY.table(schemaName, tableName)); //生成对新索引表的查询计划 SelectStatement indexSelect = FACTORY.select(select, table); ColumnResolver resolver = FromCompiler.getResolverForQuery(indexSelect, statement.getConnection()); // We will or will not do tuple projection according to the data plan. boolean isProjected = dataPlan.getContext().getResolver().getTables().get(0).getTable().getType() == PTableType.PROJECTED; // Check index state of now potentially updated index table to make sure it's active //当前索引状态是正常的 if (PIndexState.ACTIVE.equals(resolver.getTables().get(0).getTable().getIndexState())) { try { // translate nodes that match expressions that are indexed to the associated column parse node indexSelect = ParseNodeRewriter.rewrite(indexSelect, new IndexExpressionParseNodeRewriter(index, statement.getConnection(), indexSelect.getUdfParseNodes())); QueryCompiler compiler = new QueryCompiler(statement, indexSelect, resolver, targetColumns, parallelIteratorFactory, dataPlan.getContext().getSequenceManager(), isProjected); //生成新的查询计划 QueryPlan plan = compiler.compile(); // If query doesn't have where clause and some of columns to project are missing // in the index then we need to get missing columns from main table for each row in // local index. It's like full scan of both local index and data table which is inefficient. // Then we don't use the index. If all the columns to project are present in the index // then we can use the index even the query doesn't have where clause. if (index.getIndexType() == IndexType.LOCAL && indexSelect.getWhere() == null && !plan.getContext().getDataColumns().isEmpty()) { return null; } // Checking number of columns handles the wildcard cases correctly, as in that case the index // must contain all columns from the data table to be able to be used. //用到的索引是活动状态,并且要查询的字段全在索引表,就返回 if (plan.getTableRef().getTable().getIndexState() == PIndexState.ACTIVE) { if (plan.getProjector().getColumnCount() == nColumns) { //最后生成新的索引表的查询计划是可以的 return plan; } else if (index.getIndexType() == IndexType.GLOBAL) { //当依赖于全局索引表,但是全局表中又找不到要查询的字段,就出错,到下面通过子查询转换进行查询 throw new ColumnNotFoundException("*"); } } } catch (ColumnNotFoundException e) { /* Means that a column is being used that's not in our index. * Since we currently don't keep stats, we don't know the selectivity of the index. * For now, if this is a hinted plan, we will try rewriting the query as a subquery; * otherwise we just don't use this index (as opposed to trying to join back from * the index table to the data table. */ //当查询的字段不在索引表时,要关联到数据表进行数据的查询,通过子查询的方式进行数据的查询 SelectStatement dataSelect = (SelectStatement)dataPlan.getStatement(); ParseNode where = dataSelect.getWhere(); if (isHinted && where != null) { StatementContext context = new StatementContext(statement, resolver); WhereConditionRewriter whereRewriter = new WhereConditionRewriter(FromCompiler.getResolver(dataPlan.getTableRef()), context); where = where.accept(whereRewriter); if (where != null) { //这个是原始的数据表 PTable dataTable = dataPlan.getTableRef().getTable(); List<PColumn> pkColumns = dataTable.getPKColumns(); //要进行索引查询的字段 List<AliasedNode> aliasedNodes = Lists.<AliasedNode>newArrayListWithExpectedSize(pkColumns.size()); List<ParseNode> nodes = Lists.<ParseNode>newArrayListWithExpectedSize(pkColumns.size()); boolean isSalted = dataTable.getBucketNum() != null; boolean isTenantSpecific = dataTable.isMultiTenant() && statement.getConnection().getTenantId() != null; int posOffset = (isSalted ? 1 : 0) + (isTenantSpecific ? 1 : 0); for (int i = posOffset; i < pkColumns.size(); i++) { PColumn column = pkColumns.get(i); String indexColName = IndexUtil.getIndexColumnName(column); ParseNode indexColNode = new ColumnParseNode(null, '"' + indexColName + '"', indexColName); PDataType indexColType = IndexUtil.getIndexColumnDataType(column); PDataType dataColType = column.getDataType(); if (indexColType != dataColType) { indexColNode = FACTORY.cast(indexColNode, dataColType, null, null); } aliasedNodes.add(FACTORY.aliasedNode(null, indexColNode)); nodes.add(new ColumnParseNode(null, '"' + column.getName().getString() + '"')); } //生成一个内部子查询 SelectStatement innerSelect = FACTORY.select(indexSelect.getFrom(), indexSelect.getHint(), false, aliasedNodes, where, null, null, null, null, indexSelect.getBindCount(), false, indexSelect.hasSequence(), Collections.<SelectStatement>emptyList(), indexSelect.getUdfParseNodes()); ParseNode outerWhere = FACTORY.in(nodes.size() == 1 ? nodes.get(0) : FACTORY.rowValueConstructor(nodes), FACTORY.subquery(innerSelect, false), false, true); ParseNode extractedCondition = whereRewriter.getExtractedCondition(); if (extractedCondition != null) { outerWhere = FACTORY.and(Lists.newArrayList(outerWhere, extractedCondition)); } HintNode hint = HintNode.combine(HintNode.subtract(indexSelect.getHint(), new Hint[] {Hint.INDEX, Hint.NO_CHILD_PARENT_JOIN_OPTIMIZATION}), FACTORY.hint("NO_INDEX")); SelectStatement query = FACTORY.select(dataSelect, hint, outerWhere); ColumnResolver queryResolver = FromCompiler.getResolverForQuery(query, statement.getConnection()); //生成索引的子查询 query = SubqueryRewriter.transform(query, queryResolver, statement.getConnection()); queryResolver = FromCompiler.getResolverForQuery(query, statement.getConnection()); query = StatementNormalizer.normalize(query, queryResolver); QueryPlan plan = new QueryCompiler(statement, query, queryResolver, targetColumns, parallelIteratorFactory, dataPlan.getContext().getSequenceManager(), isProjected).compile(); return plan; } } } } return null; }
假如没有进行hints的提示,但是物理表建立了索引,则下面根据查询的物理表的字段中是否全部在索引表当中,如果是
则直接查询索引表返回数据,否则使用索引表作为子查询的方式,进行关联查询,从而使用索引表进行数据的快速查询
总结
根据当前是否具有hints提示,通过hints提示中使用的索引表进行查询
根据物理表建立的索引,当查询字段全在索引表时,使用索引表进行直接查询,否则生成子查询对索引表进行查询
相关文章推荐
- mac如何使用infer进行maven项目源码静态分析
- MySQL查询不使用索引汇总 + 如何优化sql语句
- 优化 SQL Server 查询性能----分析执行计划,索引与索引视图,如何识别要优化的查询
- 优化 SQL Server 查询性能----分析执行计划,索引与索引视图,如何识别要优化的查询
- oracle如何进行索引监控分析和优化
- [MySQL 工具] pt-index-usage使用 --> 对查询中的索引进行分析
- mysql的优化(表的设计,优化步骤,四种索引,分析慢查询,使用索引的深入解析,存储引擎分析,表的分割,数据库配置)
- 深入理解Oracle索引(8):如何进行索引监控分析和优化
- 数据库查询性能优化(合理使用索引|避免或简化排序|避免对大型表进行全表顺序扫描|避免使用相关的子查询|避免使用通配符匹配 )
- 使用MapInfo Professional 进行空间数据查询、分析与数据转换
- 数据库查询的优化——索引使用的注意点(转)
- 索引原理及项目中如何使用索引实例分析
- ABAP--在查询条件只包含部分索引字段时,如何使用索引
- 在SQLite中使用索引优化查询速度
- 数据查询优化之使用索引
- ABAP--在查询条件只包含部分索引字段时,如何使用索引
- 数据库查询的优化——索引使用的注意点
- 使用Explain进行查询及应用优化
- 分区表的作用和使用深入分析(如何用分区表来优化数据库)
- ABAP--在查询条件只包含部分索引字段时,如何使用索引