
ElasticSearch Study Notes: Notes on Relevance Scoring

2017-09-26 20:47
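I recently wanted to adjust the score (_score) of documents in ElasticSearch, so I read through the official ES documentation. The explanations there are detailed, and there are quite a few ways to adjust scoring; which one to use depends on the specific situation. This is a brief record for later reference.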

For the relevant explanations, see the following links:
https://www.elastic.co/guide/cn/elasticsearch/guide/current/controlling-relevance.html
https://www.elastic.co/guide/cn/elasticsearch/guide/current/function-score-query.html
https://www.elastic.co/guide/cn/elasticsearch/guide/current/boosting-by-popularity.html
https://www.elastic.co/guide/cn/elasticsearch/guide/current/function-score-filters.html
https://www.elastic.co/guide/cn/elasticsearch/guide/current/random-scoring.html
https://www.elastic.co/guide/cn/elasticsearch/guide/current/decay-functions.html
https://www.elastic.co/guide/cn/elasticsearch/guide/current/script-score.html
Below are some notes on the API operations:

Field value factor function  uses the value of a numeric field to influence the score
ScoreFunctionBuilder scoreFunctionBuilder =
    ScoreFunctionBuilders.fieldValueFactorFunction(fieldName)
        .factor(boostFactor).modifier(modifier).missing(missing).setWeight(weight);
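To make the parameters above concrete, here is a minimal sketch in plain Java of the arithmetic that field_value_factor performs as I understand it from the Elasticsearch guide: the field value (or missing, when the document has no value) is multiplied by factor, passed through the modifier, and the result is scaled by weight. The class FieldValueFactorSketch, its methods, and the reduced Modifier enum are hypothetical names for illustration only; this is not the Elasticsearch implementation.

```java
class FieldValueFactorSketch {

    // A subset of the modifiers described in the Elasticsearch guide (hypothetical enum).
    enum Modifier { NONE, LOG1P, LN1P, SQRT, SQUARE, RECIPROCAL }

    // Applies the chosen modifier to an already factor-scaled value.
    static double applyModifier(Modifier modifier, double v) {
        switch (modifier) {
            case LOG1P:      return Math.log10(1.0 + v); // common logarithm of (1 + v)
            case LN1P:       return Math.log1p(v);       // natural logarithm of (1 + v)
            case SQRT:       return Math.sqrt(v);
            case SQUARE:     return v * v;
            case RECIPROCAL: return 1.0 / v;
            default:         return v;                   // NONE: value is used as-is
        }
    }

    // fieldValue == null stands for "the document has no value for the field".
    static double score(Double fieldValue, double factor, Modifier modifier,
                        double missing, double weight) {
        double raw = (fieldValue == null) ? missing : fieldValue;
        return weight * applyModifier(modifier, factor * raw);
    }

    public static void main(String[] args) {
        // A document with 9 votes, scored with log1p.
        System.out.println(score(9.0, 1.0, Modifier.LOG1P, 1.0, 1.0));
        // A document without the field: missing = 8 is used instead.
        System.out.println(score(null, 2.0, Modifier.SQRT, 8.0, 0.5));
    }
}
```

A logarithmic modifier such as LOG1P is the usual choice for popularity-style fields, because it keeps a document with thousands of votes from swamping the text relevance score.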
Linear decay function  once the line crosses the horizontal axis at 0, all values beyond that point score 0.0
ScoreFunctionBuilders.linearDecayFunction(fieldName, origin, scale)
    .setDecay(decay).setOffset(offset).setWeight(weight)
Exponential decay function  decays sharply at first, then levels off
ScoreFunctionBuilders.exponentialDecayFunction(fieldName, origin, scale)
    .setDecay(decay).setOffset(offset).setWeight(weight)
Gaussian decay function  the Gaussian curve is bell-shaped: it decays slowly at first, then quickly, then slowly again
ScoreFunctionBuilders.gaussDecayFunction(fieldName, origin, scale)
    .setDecay(decay).setOffset(offset).setWeight(weight)
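origin  The center point, or the best possible value of the field. Documents that fall exactly on origin get the full _score of 1.0.
scale  The rate of decay, i.e. how quickly _score drops as a document moves away from origin.
decay  The _score a document receives at a distance of scale from origin; defaults to 0.5.
offset  A non-zero offset widens the origin from a single point into a range: every value within origin - offset <= value <= origin + offset gets the full _score of 1.0.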
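The way origin, scale, decay and offset interact is easier to see with numbers. The sketch below is plain Java and follows the decay formulas as I recall them from the Elasticsearch reference documentation for function_score; DecaySketch and its method names are hypothetical, and numeric fields are assumed (dates and geo points would first need to be converted to a distance). In all three curves the score is 1.0 inside the offset range and equals decay at a distance of offset + scale from origin.

```java
class DecaySketch {

    // Distance from origin, with everything inside the offset range treated as distance 0.
    static double distance(double value, double origin, double offset) {
        return Math.max(0.0, Math.abs(value - origin) - offset);
    }

    // Straight line from 1.0 down to 0.0; stays at 0.0 once it reaches the axis.
    static double linear(double value, double origin, double scale, double offset, double decay) {
        double s = scale / (1.0 - decay);
        return Math.max((s - distance(value, origin, offset)) / s, 0.0);
    }

    // Drops quickly at first, then flattens out; never reaches exactly 0.0.
    static double exponential(double value, double origin, double scale, double offset, double decay) {
        double lambda = Math.log(decay) / scale;
        return Math.exp(lambda * distance(value, origin, offset));
    }

    // Bell-shaped: slow, then fast, then slow again.
    static double gauss(double value, double origin, double scale, double offset, double decay) {
        double sigmaSquared = -(scale * scale) / (2.0 * Math.log(decay));
        double d = distance(value, origin, offset);
        return Math.exp(-(d * d) / (2.0 * sigmaSquared));
    }

    public static void main(String[] args) {
        double origin = 0.0, scale = 10.0, offset = 5.0, decay = 0.5;
        for (double value : new double[] {0.0, 5.0, 15.0, 30.0}) {
            System.out.printf("value=%.1f linear=%.4f exp=%.4f gauss=%.4f%n",
                value,
                linear(value, origin, scale, offset, decay),
                exponential(value, origin, scale, offset, decay),
                gauss(value, origin, scale, offset, decay));
        }
    }
}
```

With these settings the linear curve reaches 0.0 at a distance of offset + scale / (1 - decay) = 25 from origin, which is why the linear function suits hard cut-offs while gauss and exp suit soft preferences.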
Random scoring
ScoreFunctionBuilders.randomFunction(seed)
Weight factor
ScoreFunctionBuilders.weightFactorFunction(weight)
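The point of passing a seed to the random function is that the "random" order stays the same for as long as the seed stays the same (for example, one seed per user session). The sketch below only illustrates that property in plain Java; it is explicitly not how Elasticsearch computes random scores, and RandomScoreSketch, randomScore and weighted are hypothetical names. The weight factor is shown as a plain multiplier on a function's result.

```java
import java.util.Random;

class RandomScoreSketch {

    // Deterministic pseudo-random value in [0, 1) for a given (seed, document id) pair.
    // Assumption: mixing the seed with the id's hash code is enough for an illustration.
    static double randomScore(long seed, String docId) {
        return new Random(31L * seed + docId.hashCode()).nextDouble();
    }

    // A weight simply multiplies the score produced by a function.
    static double weighted(double functionScore, double weight) {
        return weight * functionScore;
    }

    public static void main(String[] args) {
        long seed = 42L;
        // Same seed and same document: the value is identical on every call.
        System.out.println(randomScore(seed, "doc-1") == randomScore(seed, "doc-1"));
        // Values for a few documents under the same seed.
        for (String id : new String[] {"doc-1", "doc-2", "doc-3"}) {
            System.out.println(id + " -> " + weighted(randomScore(seed, id), 2.0));
        }
    }
}
```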


Script scoring

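// Fragment from a search method: searchRequestBuilder, params, attributes, type,
// getTypeTimeField and buildBoolQuery are defined elsewhere.
String timeField = getTypeTimeField(type);
if (StringUtils.isBlank(timeField)) {
    // No time field configured for this type: use the plain bool query.
    searchRequestBuilder.setQuery(buildBoolQuery(params.keywords(), attributes));
} else {
    // Wrap the bool query in a function_score query whose script returns _score plus a time-based bonus.
    String inlineScript = ElasticScriptUtils.scriptWithScoreAndTime(timeField, System.currentTimeMillis());
    Map<String, Object> sparams = new HashMap<>();
    Script script = new Script(inlineScript, ScriptType.INLINE, "groovy", sparams);
    ScoreFunctionBuilder scoreFunctionBuilder = ScoreFunctionBuilders.scriptFunction(script);
    searchRequestBuilder.setQuery(QueryBuilders.functionScoreQuery(
        buildBoolQuery(params.keywords(), attributes), scoreFunctionBuilder));
}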

public class ElasticScriptUtils {

    // Builds a Groovy script that returns _score plus (document time / current time).
    // The date format is inferred from the shape of the stored value; a missing field
    // falls back to "1970-01-01 12:00:00".
    public static String scriptWithScoreAndTimeV1(String field, long currentTime) {
        String script = ""
            + "field = (null==_source." + field + "?\"1970-01-01 12:00:00\":_source." + field + ");"
            + "format = \"\";"
            + "ds = field.trim().split(\":\");"
            + "if (ds.length == 3) {"
            + "    format = \"yyyy-MM-dd HH:mm:ss\";"
            + "} else if (ds.length == 2) {"
            + "    format = \"yyyy-MM-dd HH:mm\";"
            + "} else if (ds.length == 1) {"
            + "    ds = field.trim().split(\" \");"
            + "    if (ds.length == 2) {"
            + "        format = \"yyyy-MM-dd HH\";"
            + "    } else if (ds.length == 1) {"
            + "        ds = field.trim().split(\"-\");"
            + "        if (ds.length == 3) {"
            + "            format = \"yyyy-MM-dd\";"
            + "        } else if (ds.length == 2) {"
            + "            format = \"yyyy-MM\";"
            + "        } else if (ds.length == 1) {"
            + "            format = \"yyyy\";"
            + "        }"
            + "    }"
            + "};"
            + "parse_date = Date.parse(format, field).getTime();"
            + "return _score.doubleValue() + (parse_date / " + currentTime + ");";
        return script;
    }

    // Same idea as scriptWithScoreAndTimeV1, but "fields" is a comma-separated list of
    // candidate time fields; the first one that is non-null in _source is used.
    public static String scriptWithScoreAndTime(String fields, long currentTime) {
        String script = ""
            + "temp_fields = \"" + fields + "\".trim().split(\",\");"
            + "target_field = temp_fields[0];"
            // Look the field up by the variable's value, not by the literal name "target_field".
            + "temp_field = _source[target_field];"
            + "for (int i = 1; i < temp_fields.length; i++) {"
            + "    if (null != temp_field) break;"
            + "    target_field = temp_fields[i];"
            + "    temp_field = _source[target_field];"
            + "};"
            + "field = (null==temp_field ? \"1970-01-01 12:00:00\" : temp_field);"
            + "format = \"\";"
            + "ds = field.trim().split(\":\");"
            + "if (ds.length == 3) {"
            + "    format = \"yyyy-MM-dd HH:mm:ss\";"
            + "} else if (ds.length == 2) {"
            + "    format = \"yyyy-MM-dd HH:mm\";"
            + "} else if (ds.length == 1) {"
            + "    ds = field.trim().split(\" \");"
            + "    if (ds.length == 2) {"
            + "        format = \"yyyy-MM-dd HH\";"
            + "    } else if (ds.length == 1) {"
            + "        ds = field.trim().split(\"-\");"
            + "        if (ds.length == 3) {"
            + "            format = \"yyyy-MM-dd\";"
            + "        } else if (ds.length == 2) {"
            + "            format = \"yyyy-MM\";"
            + "        } else if (ds.length == 1) {"
            + "            format = \"yyyy\";"
            + "        }"
            + "    }"
            + "};"
            + "parse_date = Date.parse(format, field).getTime();"
            // Debug output, disabled:
            // + "println _score.doubleValue(); println (parse_date / " + currentTime + ");"
            + "return _score.doubleValue() + (parse_date / " + currentTime + ");";
        return script;
    }

}
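Because the Groovy script is assembled as a string, its logic is hard to test in isolation. As a rough aid, here is the same logic sketched in plain Java: choose the first non-null candidate field, infer the date pattern from the shape of the value, parse it, and add (document time / current time) to the score. TimeScoreSketch and all of its method names are hypothetical, and parsing in UTC is an assumption made here so that the results are reproducible; the Groovy Date.parse call in the script would use the node's default time zone.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;

class TimeScoreSketch {

    static final String DEFAULT_TIME = "1970-01-01 12:00:00";

    // Returns the first non-null value among the comma-separated field names, or null.
    static Object firstPresent(Map<String, Object> source, String fields) {
        for (String name : fields.split(",")) {
            Object value = source.get(name.trim());
            if (value != null) return value;
        }
        return null;
    }

    // Infers the pattern from the number of ':', ' ' and '-' separated parts, as the script does.
    static String detectFormat(String value) {
        String v = value.trim();
        int parts = v.split(":").length;
        if (parts == 3) return "yyyy-MM-dd HH:mm:ss";
        if (parts == 2) return "yyyy-MM-dd HH:mm";
        if (parts == 1) {
            parts = v.split(" ").length;
            if (parts == 2) return "yyyy-MM-dd HH";
            if (parts == 1) {
                parts = v.split("-").length;
                if (parts == 3) return "yyyy-MM-dd";
                if (parts == 2) return "yyyy-MM";
                if (parts == 1) return "yyyy";
            }
        }
        throw new IllegalArgumentException("Unrecognized date shape: " + value);
    }

    // Parses the value with the inferred pattern (UTC is an assumption of this sketch).
    static long toEpochMillis(String value) {
        SimpleDateFormat format = new SimpleDateFormat(detectFormat(value));
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return format.parse(value.trim()).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse date: " + value, e);
        }
    }

    // Newer documents get a bonus closer to 1.0; the division is done in floating point.
    static double boostedScore(double score, long docMillis, long nowMillis) {
        return score + (double) docMillis / nowMillis;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("update_time", "2017-09-26 20:47");
        Object raw = firstPresent(source, "publish_time,update_time");
        String time = (raw == null) ? DEFAULT_TIME : raw.toString();
        System.out.println(detectFormat(time));
        System.out.println(boostedScore(1.5, toEpochMillis(time), System.currentTimeMillis()));
    }
}
```

Since the bonus is always between 0 and roughly 1, it mainly acts as a tie-breaker between documents with similar text scores. Note also that how the script's return value is combined with the query score depends on the boost mode of the function_score query.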