Naive Bayes
toyml.classification.naive_bayes.BaseNaiveBayes
dataclass
Bases: ABC
class_prior_
class-attribute
instance-attribute
The prior probability of each class in the training dataset.
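As an illustration of what `class_prior_` holds, the priors can be estimated as the relative frequency of each label. This is a minimal sketch under that assumption; `class_priors` is a hypothetical helper, not toyml's own code, and toyml may compute the values differently.

```python
from collections import Counter


def class_priors(labels: list[int]) -> dict[int, float]:
    """Relative frequency of each label (hypothetical helper, not part of toyml)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


priors = class_priors([0, 0, 0, 0, 1, 1, 1, 1])
print(priors)  # {0: 0.5, 1: 0.5}
```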
predict
Predict the class label for a given sample.
PARAMETER | DESCRIPTION |
---|---|
sample | A single sample to predict, represented as a list of feature values. |
RETURNS | DESCRIPTION |
---|---|
int | Predicted class label. |
Source code in toyml/classification/naive_bayes.py
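A common decision rule for Naive Bayes is to return the label with the highest predicted probability. The sketch below assumes that rule; `predict_from_proba` is a hypothetical helper and is not taken from the toyml source.

```python
def predict_from_proba(proba: dict[int, float]) -> int:
    """Return the label with the largest probability (assumed decision rule)."""
    return max(proba, key=lambda label: proba[label])


label = predict_from_proba({0: 0.2, 1: 0.8})
print(label)  # 1
```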
predict_proba
Predict class probabilities for a given sample.
PARAMETER | DESCRIPTION |
---|---|
sample | A single sample to predict, represented as a list of feature values. |
normalization | Whether to normalize the probabilities. Default is True. |
RETURNS | DESCRIPTION |
---|---|
dict[Class, float] | Dictionary mapping class labels to their predicted probabilities. |
Source code in toyml/classification/naive_bayes.py
predict_log_proba
Predict log probabilities for a given sample.
PARAMETER | DESCRIPTION |
---|---|
sample | A single sample to predict, represented as a list of feature values. |
normalization | Whether to normalize the log probabilities. Default is True. |
RETURNS | DESCRIPTION |
---|---|
dict[Class, float] | Dictionary mapping class labels to their predicted log probabilities. |
Source code in toyml/classification/naive_bayes.py
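One plausible way to normalize log probabilities is the log-sum-exp trick, which subtracts the log of the total probability mass without underflowing. The sketch below assumes that approach; `normalize_log_scores` is a hypothetical name, and the toyml implementation may differ.

```python
import math


def normalize_log_scores(log_scores: dict[int, float]) -> dict[int, float]:
    """Shift unnormalized log scores so that their exponentials sum to 1."""
    peak = max(log_scores.values())
    # Subtracting the peak keeps every exponent <= 0, which avoids underflow to 0.
    log_total = peak + math.log(sum(math.exp(v - peak) for v in log_scores.values()))
    return {label: v - log_total for label, v in log_scores.items()}


# Scores this small would underflow to 0.0 if exponentiated directly.
log_proba = normalize_log_scores({0: -1050.0, 1: -1052.0})
proba = {label: math.exp(v) for label, v in log_proba.items()}
```

Exponentiating the normalized values yields ordinary probabilities, so the same routine could serve both a log-probability and a probability output.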
toyml.classification.naive_bayes.GaussianNaiveBayes
dataclass
GaussianNaiveBayes(class_prior_: dict[Class, float] = dict(), unbiased_variance: bool = True, var_smoothing: float = 1e-09, labels_: list[Class] = list(), class_count_: int = 0, means_: dict[Class, list[float]] = dict(), variances_: dict[Class, list[float]] = dict(), epsilon_: float = 0)
Bases: BaseNaiveBayes
Gaussian Naive Bayes classification algorithm implementation.
Examples:
>>> label = [0, 0, 0, 0, 1, 1, 1, 1]
>>> dataset = [
... [6.00, 180, 12],
... [5.92, 190, 11],
... [5.58, 170, 12],
... [5.92, 165, 10],
... [5.00, 100, 6],
... [5.50, 150, 8],
... [5.42, 130, 7],
... [5.75, 150, 9],
... ]
>>> clf = GaussianNaiveBayes().fit(dataset, label)
>>> clf.predict([6.00, 130, 8])
1
unbiased_variance
class-attribute
instance-attribute
unbiased_variance: bool = True
Whether to use the unbiased variance estimate. Default is True.
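The two estimators differ only in the divisor: the unbiased estimate divides the squared deviations by n - 1, the biased one by n. A minimal sketch with the standard library, assuming `unbiased_variance=True` corresponds to the n - 1 divisor:

```python
import statistics

# First feature of the four class-0 samples from the example above.
heights = [6.00, 5.92, 5.58, 5.92]

unbiased = statistics.variance(heights)  # divides by n - 1
biased = statistics.pvariance(heights)   # divides by n

print(unbiased > biased)
```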
var_smoothing
class-attribute
instance-attribute
var_smoothing: float = 1e-09
Portion of the largest variance of all features that is added to the variances for numerical stability.
labels_
class-attribute
instance-attribute
The labels in the training dataset.
class_count_
class-attribute
instance-attribute
class_count_: int = 0
The number of classes in the training dataset.
class_prior_
class-attribute
instance-attribute
The prior probability of each class in the training dataset.
means_
class-attribute
instance-attribute
The per-feature means of each class in the training dataset.
variances_
class-attribute
instance-attribute
The per-feature variances of each class in the training dataset.
epsilon_
class-attribute
instance-attribute
epsilon_: float = 0
The absolute value added to the variances.
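The descriptions of `var_smoothing` and `epsilon_` suggest that `epsilon_` is `var_smoothing` times the largest feature variance. The sketch below assumes exactly that, and also assumes the population variance taken over the whole dataset; `smoothing_epsilon` is a hypothetical helper, not toyml's code.

```python
import statistics


def smoothing_epsilon(dataset: list[list[float]], var_smoothing: float = 1e-9) -> float:
    """Scale the largest per-feature variance by var_smoothing (assumed rule)."""
    columns = zip(*dataset)  # transpose rows into feature columns
    largest = max(statistics.pvariance(column) for column in columns)
    return var_smoothing * largest


# Column variances are 1.0 and 100.0, so the largest is 100.0.
epsilon = smoothing_epsilon([[1.0, 10.0], [3.0, 30.0]])
```

Adding such a value to every variance keeps the Gaussian density finite when a feature is constant within a class.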
fit
fit(dataset: list[list[FeatureValue]], labels: list[Class]) -> GaussianNaiveBayes
Fit the Gaussian Naive Bayes classifier.
PARAMETER | DESCRIPTION |
---|---|
dataset | Training data, where each row is a sample and each column is a feature. |
labels | Target labels for training data. |
RETURNS | DESCRIPTION |
---|---|
self | Returns the instance itself. |
Source code in toyml/classification/naive_bayes.py
_log_likelihood
Calculate the log-likelihood of a sample under each class.
Source code in toyml/classification/naive_bayes.py
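Under the Gaussian assumption, the log-likelihood of a sample for one class is the sum of per-feature normal log densities. A minimal sketch, assuming per-class lists of means and already smoothed variances shaped like the entries of `means_` and `variances_`; `gaussian_log_likelihood` is a hypothetical name.

```python
import math


def gaussian_log_likelihood(
    sample: list[float], means: list[float], variances: list[float]
) -> float:
    """Sum of log N(x | mean, var) over features, treating features as independent."""
    total = 0.0
    for x, mean, var in zip(sample, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
    return total


# A standard normal evaluated at its mean: -0.5 * log(2 * pi).
ll = gaussian_log_likelihood([0.0], [0.0], [1.0])
```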
_dataset_column_means
staticmethod
Calculate the mean of each column (feature) of the dataset.
Source code in toyml/classification/naive_bayes.py
_dataset_column_variances
Calculate the variance of each column (feature) of the dataset.
Source code in toyml/classification/naive_bayes.py
toyml.classification.naive_bayes.MultinomialNaiveBayes
dataclass
MultinomialNaiveBayes(class_prior_: dict[Class, float] = dict(), alpha: float = 1.0, labels_: list[Class] = list(), class_count_: int = 0, class_feature_count_: dict[Class, list[int]] = dict(), class_feature_log_prob_: dict[Class, list[float]] = dict())
Bases: BaseNaiveBayes
Multinomial Naive Bayes classifier.
Examples:
>>> import random
>>> rng = random.Random(0)
>>> dataset = [[rng.randint(0, 5) for _ in range(100)] for _ in range(6)]
>>> label = [1, 2, 3, 4, 5, 6]
>>> clf = MultinomialNaiveBayes().fit(dataset, label)
>>> clf.predict(dataset[2])
3
alpha
class-attribute
instance-attribute
alpha: float = 1.0
Additive (Laplace/Lidstone) smoothing parameter.
labels_
class-attribute
instance-attribute
The labels in the training dataset.
class_count_
class-attribute
instance-attribute
class_count_: int = 0
The number of classes in the training dataset.
class_prior_
class-attribute
instance-attribute
The prior probability of each class in the training dataset.
class_feature_count_
class-attribute
instance-attribute
The feature value counts of each class in the training dataset.
class_feature_log_prob_
class-attribute
instance-attribute
The feature log probabilities of each class in the training dataset.
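To illustrate how additive smoothing could turn per-class feature counts into log probabilities, the sketch below adds `alpha` to every count before normalizing. It assumes the textbook Lidstone formula; `smoothed_log_probs` is a hypothetical helper, not toyml's code.

```python
import math


def smoothed_log_probs(feature_counts: list[int], alpha: float = 1.0) -> list[float]:
    """log((count_i + alpha) / (sum(counts) + alpha * n_features)) for each feature."""
    total = sum(feature_counts) + alpha * len(feature_counts)
    return [math.log((count + alpha) / total) for count in feature_counts]


# With alpha = 1 the probabilities are 4/7, 1/7 and 2/7.
log_probs = smoothed_log_probs([3, 0, 1])
```

With `alpha > 0`, a feature that never occurs in a class still gets a finite log probability instead of `log(0)`.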
fit
fit(dataset: list[list[FeatureValue]], labels: list[Class]) -> MultinomialNaiveBayes
Fit the Multinomial Naive Bayes classifier.
PARAMETER | DESCRIPTION |
---|---|
dataset | Training data, where each row is a sample and each column is a feature. Features should be represented as counts (non-negative integers). |
labels | Target labels for training data. |
RETURNS | DESCRIPTION |
---|---|
self | Returns the instance itself. |
Source code in toyml/classification/naive_bayes.py
_log_likelihood
Calculate the log-likelihood of a sample under each class.
Source code in toyml/classification/naive_bayes.py
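For count features, the multinomial log-likelihood of a sample under one class is, up to a term that is the same for every class, the count-weighted sum of that class's feature log probabilities. A minimal sketch under that assumption; `multinomial_log_likelihood` is a hypothetical name.

```python
import math


def multinomial_log_likelihood(sample: list[int], feature_log_prob: list[float]) -> float:
    """Sum of count_i * log p_i, omitting the class-independent multinomial coefficient."""
    return sum(count * log_p for count, log_p in zip(sample, feature_log_prob))


log_prob = [math.log(0.5), math.log(0.25), math.log(0.25)]
# 2 * log(0.5) + 0 * log(0.25) + 1 * log(0.25) = log(0.0625)
ll = multinomial_log_likelihood([2, 0, 1], log_prob)
```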
_dataset_feature_counts
Calculate feature value counts.
Source code in toyml/classification/naive_bayes.py
toyml.classification.naive_bayes.CategoricalNaiveBayes
dataclass
CategoricalNaiveBayes(class_prior_: dict[Class, float] = dict(), alpha: float = 1.0, labels_: list[Class] = list(), class_count_: int = 0, class_feature_count_: dict[Class, dict[Dimension, dict[FeatureValue, float]]] = dict(), class_feature_log_prob_: dict[Class, dict[Dimension, dict[FeatureValue, float]]] = dict())
Bases: BaseNaiveBayes
Categorical Naive Bayes classifier.
Examples:
>>> import random
>>> rng = random.Random(0)
>>> dataset = [[rng.randint(0, 5) for _ in range(100)] for _ in range(6)]
>>> label = [1, 2, 3, 4, 5, 6]
>>> clf = CategoricalNaiveBayes().fit(dataset, label)
>>> clf.predict(dataset[2])
3
alpha
class-attribute
instance-attribute
alpha: float = 1.0
Additive (Laplace/Lidstone) smoothing parameter.
labels_
class-attribute
instance-attribute
The labels in the training dataset.
class_count_
class-attribute
instance-attribute
class_count_: int = 0
The number of classes in the training dataset.
class_prior_
class-attribute
instance-attribute
The prior probability of each class in the training dataset.
class_feature_count_
class-attribute
instance-attribute
class_feature_count_: dict[Class, dict[Dimension, dict[FeatureValue, float]]] = field(default_factory=dict)
The feature value counts of each class in the training dataset.
class_feature_log_prob_
class-attribute
instance-attribute
class_feature_log_prob_: dict[Class, dict[Dimension, dict[FeatureValue, float]]] = field(default_factory=dict)
The log probability of each feature value for each class in the training dataset.
fit
fit(dataset: list[list[FeatureValue]], labels: list[Class]) -> CategoricalNaiveBayes
Fit the Categorical Naive Bayes classifier.
PARAMETER | DESCRIPTION |
---|---|
dataset | Training data, where each row is a sample and each column is a feature. |
labels | Target labels for training data. |
RETURNS | DESCRIPTION |
---|---|
self | Returns the instance itself. |
Source code in toyml/classification/naive_bayes.py
_log_likelihood
Calculate the log-likelihood of a sample under each class.
Source code in toyml/classification/naive_bayes.py
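For categorical features, the log-likelihood of a sample under one class can be read off a nested lookup table: one log probability per dimension and observed value. The sketch below assumes a per-class table shaped like the inner dictionaries of `class_feature_log_prob_`; `categorical_log_likelihood` is a hypothetical name, and handling of values unseen in training is left out.

```python
import math


def categorical_log_likelihood(
    sample: list[str], feature_log_prob: dict[int, dict[str, float]]
) -> float:
    """Sum the log probability of the observed value in each dimension."""
    return sum(feature_log_prob[dim][value] for dim, value in enumerate(sample))


log_prob = {
    0: {"red": math.log(0.75), "blue": math.log(0.25)},
    1: {"s": math.log(0.5), "l": math.log(0.5)},
}
# log(0.75) + log(0.5) = log(0.375)
ll = categorical_log_likelihood(["red", "l"], log_prob)
```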
_dataset_feature_counts
staticmethod
_dataset_feature_counts(dataset: list[list[FeatureValue]], feature_smooth_count: dict[Dimension, dict[FeatureValue, float]]) -> dict[Dimension, dict[FeatureValue, float]]
Calculate feature value counts.
Source code in toyml/classification/naive_bayes.py
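The signature suggests that counting starts from `feature_smooth_count` and adds the observed values on top. The sketch below assumes that reading, with the smoothing table seeded with `alpha` for every known value; `feature_counts` is a hypothetical stand-in and is not copied from toyml.

```python
from copy import deepcopy


def feature_counts(
    dataset: list[list[int]],
    feature_smooth_count: dict[int, dict[int, float]],
) -> dict[int, dict[int, float]]:
    """Count each value per dimension, starting from the smoothing counts."""
    counts = deepcopy(feature_smooth_count)  # leave the caller's table untouched
    for row in dataset:
        for dim, value in enumerate(row):
            per_dim = counts.setdefault(dim, {})
            per_dim[value] = per_dim.get(value, 0.0) + 1.0
    return counts


smooth = {0: {0: 1.0, 1: 1.0}, 1: {0: 1.0, 1: 1.0}}
counts = feature_counts([[0, 1], [0, 0]], smooth)
print(counts)  # {0: {0: 3.0, 1: 1.0}, 1: {0: 2.0, 1: 2.0}}
```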