lkl Risk Control: Logistic Regression Analysis Model Test Code (Spark 1.6)
2017-10-31 17:08
```scala
/**
 * Created by lkl on 2017/10/31.
 */
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithLBFGS}
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import scala.collection.mutable.ArrayBuffer

object logisticregression {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("test") // .setMaster("spark://192.168.0.37:7077")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val hc = new HiveContext(sc)
    val data2 = hc.sql("select * from fin_tec.uvcy2")

    // Column 0 is the ID-card number and column 1 is the overdue flag (the label),
    // so both are skipped and feature extraction starts at column 2.
    // Null cells become 0.0; Double, Long and numeric String cells are coerced to Double.
    val data = data2.map { row =>
      val arr = new ArrayBuffer[Double]()
      for (i <- 2 until row.size) {
        if (row.isNullAt(i)) arr += 0.0
        else if (row.get(i).isInstanceOf[Double]) arr += row.getDouble(i)
        else if (row.get(i).isInstanceOf[Long]) arr += row.getLong(i).toDouble
        else if (row.get(i).isInstanceOf[String]) arr += row.getString(i).toDouble
      }
      LabeledPoint(row.getDouble(1), Vectors.dense(arr.toArray))
    }

    // Split the data into a 70% training set and a 30% test set.
    val splits = data.randomSplit(Array(0.7, 0.3))
    val training = splits(0).cache()
    val test = splits(1)

    // Run the training algorithm to build the model.
    // The overdue flag is binary, so two classes suffice here
    // (the original listing carried setNumClasses(10) over from the MLlib multiclass example).
    val model = new LogisticRegressionWithLBFGS()
      .setNumClasses(2)
      .run(training)

    // Compute predictions on the test set.
    val predictionAndLabels = test.map { case LabeledPoint(label, features) =>
      val prediction = model.predict(features)
      (prediction, label)
    }

    // Get evaluation metrics.
    val metrics = new MulticlassMetrics(predictionAndLabels)
    val precision = metrics.precision
    println("Precision = " + precision)

    // Save and reload the model.
    model.save(sc, "myModelPath")
    val sameModel = LogisticRegressionModel.load(sc, "myModelPath")
  }
}
```
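The per-cell coercion rule inside the `map` above can be isolated as a plain function, which makes it easy to check without a Spark cluster. This is a minimal sketch, and `cellToDouble` is an illustrative name that does not appear in the original code:

```scala
// Hypothetical helper mirroring the row-parsing rule used above:
// a null cell becomes 0.0; Double, Long and numeric String cells
// are coerced to Double.
def cellToDouble(v: Any): Double = v match {
  case null      => 0.0
  case d: Double => d
  case l: Long   => l.toDouble
  case s: String => s.toDouble
}

println(cellToDouble(null))   // 0.0
println(cellToDouble(7L))     // 7.0
println(cellToDouble("2.5"))  // 2.5
```

One difference worth noting: the original if-chain silently skips a cell of any other runtime type (e.g. `Int`), which would shorten the feature vector, whereas this sketch fails fast with a `MatchError`.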