scikit-learn implements most of the fundamental machine learning algorithms. Let's take a quick tour.

1. Logistic regression

Most problems can be reduced to binary classification. A strength of this algorithm is that it can output the probability that a sample belongs to each class.

from sklearn import datasets
from sklearn import metrics
from sklearn.linear_model import LogisticRegression

# the original leaves X, y implicit; load the iris dataset here for illustration
X, y = datasets.load_iris(return_X_y=True)

model = LogisticRegression()
model.fit(X, y)
print(model)

# make predictions
expected = y
predicted = model.predict(X)

# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
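The probability output mentioned above comes from predict_proba. A minimal sketch, using the iris dataset as illustrative sample data (the original does not specify one):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)  # illustrative dataset, not from the original
model = LogisticRegression(max_iter=1000)  # raise max_iter so the solver converges
model.fit(X, y)

# each row gives the probability of belonging to each of the 3 iris classes,
# and the probabilities in a row sum to 1
proba = model.predict_proba(X[:5])
print(proba.shape)
```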
2. Naive Bayes

This is another well-known machine learning algorithm. It works by estimating the density of the training data's distribution, and it performs well on multi-class classification.

from sklearn import metrics
from sklearn.naive_bayes import GaussianNB

model = GaussianNB()
model.fit(X, y)
print(model)

# make predictions
expected = y
predicted = model.predict(X)

# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
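The "distribution density" remark can be made concrete: GaussianNB fits a per-feature Gaussian for every class and stores the estimated means in theta_. A sketch on iris (an assumed dataset, not named by the original):

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
model = GaussianNB()
model.fit(X, y)

# theta_ holds the per-class feature means the Gaussian densities are built from:
# one row per class (3), one column per feature (4)
print(model.theta_.shape)
```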
3. k-nearest neighbors

The k-NN algorithm is often used as one component of a classification pipeline; for example, it can be used to score features, which is useful in feature selection.

from sklearn import metrics
from sklearn.neighbors import KNeighborsClassifier

# fit a k-nearest neighbor model to the data
model = KNeighborsClassifier()
model.fit(X, y)
print(model)

# make predictions
expected = y
predicted = model.predict(X)

# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
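KNeighborsClassifier's main knob is n_neighbors (default 5). A sketch on iris (assumed dataset) showing how it is set:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
# n_neighbors trades off bias and variance: small k overfits, large k oversmooths
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X, y)
acc = model.score(X, y)  # training accuracy, optimistic by construction
```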
4. Decision trees

The Classification and Regression Trees (CART) algorithm is commonly used for classification or regression problems whose features carry categorical information, and it is well suited to multi-class settings.

from sklearn import metrics
from sklearn.tree import DecisionTreeClassifier

# fit a CART model to the data
model = DecisionTreeClassifier()
model.fit(X, y)
print(model)

# make predictions
expected = y
predicted = model.predict(X)

# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
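A fitted tree also reports which features its splits rely on, via feature_importances_. A sketch on iris (assumed dataset):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(random_state=0)  # fixed seed for reproducibility
model.fit(X, y)

# one importance per feature; the values are normalized to sum to 1
importances = model.feature_importances_
```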
5. Support vector machines

The SVM is a very popular machine learning algorithm, used mainly for classification. Like logistic regression, it can handle multi-class problems with a one-vs-rest scheme.

from sklearn import metrics
from sklearn.svm import SVC

# fit a SVM model to the data
model = SVC()
model.fit(X, y)
print(model)

# make predictions
expected = y
predicted = model.predict(X)

# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
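One nuance on the multi-class point: sklearn's SVC actually trains one-vs-one classifiers internally, but with decision_function_shape='ovr' (the default in recent versions) it exposes one score per class, as a one-vs-rest scheme would. A sketch on iris (assumed dataset):

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# SVC trains one-vs-one pairs internally; 'ovr' reshapes the decision values
# so that there is one score per class
model = SVC(decision_function_shape='ovr')
model.fit(X, y)
scores = model.decision_function(X[:2])
print(scores.shape)
```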
Beyond classification and regression, scikit-learn provides more sophisticated algorithms, such as clustering, and also implements ensemble techniques such as bagging and boosting.

[Reprint] Original: https://blog.csdn.net/Ren_Gong_Zhi_Neng/article/details/80745377
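The bagging technique mentioned above can be sketched in a few lines; iris and the parameter values are illustrative assumptions, not from the original:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# train 10 trees on bootstrap samples of the data and let them vote
model = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10, random_state=0)
model.fit(X, y)
acc = model.score(X, y)
```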