Understanding Naive Bayes



The Naïve Bayes classifier is a machine learning model used to classify an object based on different features. The attribute we are trying to predict is referred to as the dependent variable, whereas the features used to predict it are known as independent variables (predictors).

The Naïve Bayes classifier is a probabilistic model based on Bayes' Theorem, which states that:

P(y | X) = P(X | y) · P(y) / P(X)

In words, this gives the probability of event y occurring given that event X has already occurred.
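
As a quick numerical illustration of the theorem, the sketch below plugs hypothetical values for P(X | y), P(y) and P(X) into the formula; the numbers are made up for illustration only and are not taken from the dataset discussed next.

```python
# Hypothetical example of Bayes' Theorem: P(y | X) = P(X | y) * P(y) / P(X)
# The numbers below are illustrative placeholders.
p_y = 0.3           # prior probability of event y
p_x_given_y = 0.8   # likelihood of observing X when y is true
p_x = 0.5           # overall probability of observing X

p_y_given_x = p_x_given_y * p_y / p_x
print(p_y_given_x)  # 0.48 -- the updated (posterior) belief in y after seeing X
```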

Two assumptions are made in order to use this theorem in Naïve Bayes. The first is that the predictors are independent of each other, i.e. the presence of one feature does not affect the presence of any other feature. This is why the model is called "naïve".

The second assumption is that equal weight should be given to all the predictors when making the prediction.

Let's consider the example below to understand how the theorem is applied.

(Table: a sample dataset in which each row is one day, with predictor columns OUTLOOK, TEMPERATURE, HUMIDITY and WINDY, and the target column PLAY GOLF.)

Here, we are trying to predict whether a day is good for playing golf, given the features of that day. Each row represents one day, and the columns (except PLAY GOLF) represent the predictor features. Looking at the first row, the day is not suitable for playing golf when the outlook is rainy, humidity is high, the temperature is hot and it is not windy. The first assumption of Naïve Bayes is that the predictors are independent, i.e. OUTLOOK does not depend on TEMPERATURE, HUMIDITY or WINDY, and likewise for the other features. The second assumption is that equal importance is given to all the predictors.
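
To make these two ingredients concrete, here is a minimal Python sketch that estimates the prior P(y) and the per-feature conditional probabilities P(xi | y) by simple counting. The rows in the snippet are illustrative placeholders in the spirit of the table above, not the exact values from it.

```python
from collections import Counter, defaultdict

# Illustrative rows (OUTLOOK, TEMPERATURE, HUMIDITY, WINDY, PLAY GOLF);
# placeholder values, not the exact table shown above.
rows = [
    ("Rainy",    "Hot",  "High",   False, "No"),
    ("Rainy",    "Hot",  "High",   True,  "No"),
    ("Overcast", "Hot",  "High",   False, "Yes"),
    ("Sunny",    "Mild", "High",   False, "Yes"),
    ("Sunny",    "Cool", "Normal", False, "Yes"),
]
features = ["OUTLOOK", "TEMPERATURE", "HUMIDITY", "WINDY"]

# Prior P(y): relative frequency of each class label.
class_counts = Counter(r[-1] for r in rows)
priors = {y: n / len(rows) for y, n in class_counts.items()}

# Conditional P(x_i | y): frequency of each feature value within each class.
cond = defaultdict(lambda: defaultdict(Counter))
for *xs, y in rows:
    for name, value in zip(features, xs):
        cond[y][name][value] += 1

likelihoods = {
    y: {name: {v: c / class_counts[y] for v, c in counts.items()}
        for name, counts in by_feature.items()}
    for y, by_feature in cond.items()
}

print(priors)                         # e.g. {'No': 0.4, 'Yes': 0.6}
print(likelihoods["Yes"]["OUTLOOK"])  # e.g. {'Overcast': 0.33..., 'Sunny': 0.66...}
```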

In the Bayes Theorem above, X can be treated as the feature vector and rewritten as below.

X = (x1, x2, x3, …, xn)

where x1, x2, x3, …, xn are the predictors (here: OUTLOOK, TEMPERATURE, HUMIDITY and WINDY).

Substituting X, the original Bayes equation can be rewritten as below:

P(y | x1, x2, …, xn) = [ P(x1 | y) · P(x2 | y) · … · P(xn | y) · P(y) ] / [ P(x1) · P(x2) · … · P(xn) ]

Now the values from the dataset can be substituted into the above equation. The denominator remains constant for any given input, so the expression can be simplified as below.

P(y | x1, x2, …, xn) ∝ P(y) · P(x1 | y) · P(x2 | y) · … · P(xn | y)

In our case, the class y has two outcomes, and we need to select the class with the maximum probability, which can be written as:

y = argmax_y  P(y) · P(x1 | y) · P(x2 | y) · … · P(xn | y)

This expression can be used to predict the class for a given set of features.
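
A minimal sketch of that prediction step is shown below, assuming the prior and conditional probabilities have already been estimated; the specific numbers are illustrative placeholders rather than values computed from the table above. Summing log-probabilities instead of multiplying raw probabilities avoids numerical underflow while selecting the same class.

```python
import math

# Illustrative, hand-picked probabilities -- not derived from the table above.
priors = {"Yes": 0.64, "No": 0.36}
likelihoods = {
    "Yes": {"OUTLOOK=Sunny": 0.22, "TEMPERATURE=Cool": 0.33,
            "HUMIDITY=High": 0.33, "WINDY=True": 0.33},
    "No":  {"OUTLOOK=Sunny": 0.60, "TEMPERATURE=Cool": 0.20,
            "HUMIDITY=High": 0.80, "WINDY=True": 0.60},
}

def predict(observed_features):
    """Return the class maximising P(y) * product of P(x_i | y)."""
    scores = {}
    for y, prior in priors.items():
        # Work in log space: log P(y) + sum of log P(x_i | y).
        score = math.log(prior)
        for f in observed_features:
            score += math.log(likelihoods[y][f])
        scores[y] = score
    return max(scores, key=scores.get)

day = ["OUTLOOK=Sunny", "TEMPERATURE=Cool", "HUMIDITY=High", "WINDY=True"]
print(predict(day))  # 'No' with these illustrative numbers
```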

The algorithm relies on the assumption that the predictors are independent, which rarely holds for real-world features. It works well for multiclass prediction, since it provides a probability for each class. Because of these properties, the algorithm is widely used in text classification, spam filtering and sentiment analysis, where it has shown better results than other algorithms [1].
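
As a rough illustration of that text-classification use case, the sketch below uses scikit-learn's multinomial Naïve Bayes on a handful of made-up messages; scikit-learn and the example data are assumptions for this sketch, not something the post itself relies on.

```python
# Requires scikit-learn (pip install scikit-learn); a minimal, illustrative sketch.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up training set: spam vs. ham messages.
messages = [
    "win a free prize now", "limited offer click here",
    "meeting at ten tomorrow", "please review the attached report",
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each message into word-count features, then fit the classifier.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)
model = MultinomialNB()
model.fit(X, labels)

# Predict the class (and per-class probabilities) for a new message.
new = vectorizer.transform(["click here to win a prize"])
print(model.predict(new))        # e.g. ['spam']
print(model.predict_proba(new))  # probability for each class
```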

In Oracle, we can create a Naïve Bayes machine learning model in Oracle Analytics Cloud (OAC) as well as in Autonomous Data Warehouse (ADW). OAC provides drag-and-drop functionality and is quick to use for business users with little programming knowledge, whereas ADW works directly on the database and is mostly used by IT teams with good knowledge of PL/SQL. In Oracle Database, Data Mining functions are provided for the different algorithms. To learn more about the usability of OAC and ADW, please refer to our video series.


Sources

  1. G. Chauhan, "All About Naive Bayes," Towards Data Science, 2018. [Online]. Available: https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf
  2. "Naive Bayes Classifiers," GeeksforGeeks. [Online]. Available: https://www.geeksforgeeks.org/naive-bayes-classifiers/

 
