As part of Augmented Analytics, we can use natural language and machine learning to assist with data preparation, insight generation, and insight explanation within Oracle Analytics Cloud (OAC) and Oracle Analytics Server (OAS).
There are five elements that we should investigate for augmented analytics in Oracle Analytics:
- Natural language and voice-activated search
- Natural language generation
- Data enrichment
- One click “Explain”
- Machine learning for predictive analytics
Natural language and voice-activated search
Find and Explore Your Content:
From the Oracle Analytics Cloud (OAC) home page we can easily find our analytics content, such as projects, data flows, data sets, and connections.
1. Navigator & Search Bar: On the home page, we can use the navigator or search bar to find the content we are interested in. The navigator helps us to quickly access our content.
2. Sort by/Grid/List: We can organise our content by using the display option to sort or change how the content is displayed.
3. Customize Home Page: We can customize the home page by opening the page menu and selecting Customize Home Page.
Natural language generation
Natural language generation (NLG) is the ability to produce a written or spoken description of what has happened in the data. This is also called “language out”. In Oracle Analytics, we can summarize meaningful information as text by using a concept known as the “grammar of graphics”. Narration works out of the box with any Oracle Analytics visualization.
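To make the idea of “language out” concrete, here is a minimal template-based sketch in Python. It is not how Oracle Analytics generates its narratives; the `narrate` function, its parameters, and the sample figures are all hypothetical and only show how a small set of values can be turned into a sentence.

```python
def narrate(measure, by, values):
    """Return a short narrative for a {category: value} mapping (illustrative only)."""
    if not values:
        return f"No {measure} data is available."
    total = sum(values.values())
    top = max(values, key=values.get)   # category with the largest value
    low = min(values, key=values.get)   # category with the smallest value
    share = 100 * values[top] / total if total else 0.0
    return (
        f"Total {measure} is {total:,.0f} across {len(values)} {by} values. "
        f"{top} is the highest at {values[top]:,.0f} ({share:.0f}% of the total), "
        f"and {low} is the lowest at {values[low]:,.0f}."
    )


# Hypothetical sample data: sales by region.
sales = {"North": 120, "South": 300, "East": 180}
summary = narrate("sales", "region", sales)
print(summary)
```

A real NLG engine chooses what to say based on the visualization and the data behind it; the fixed template here only illustrates the step from numbers to text.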
Data enrichment
Data Profiles and Semantic Recommendations:
After we create a data set, it undergoes column-level profiling to produce a set of semantic recommendations to repair and enrich our data. During the profiling step, the system automatically detects specific semantic types and gives us recommendations based on them.
There are various categories of semantic types, such as geographic locations, which are identified by city names.
- Semantic Type Categories: Profiling is applied to various semantic types, such as geographic locations (for example, city names) and patterns like those found in credit card numbers or email addresses.
- Semantic Type Recommendations: Recommendations to repair, enhance, or enrich the data set are determined by the type of data. For example, Enrichments, Column Concatenations, Date Extractions, etc.
- Recognized Pattern-Based Semantic Types: Semantic types are identified based on patterns found in the data. For example, Email Addresses, Dates, etc.
- Reference-Based Semantic Types: Semantic types are identified by loaded reference knowledge provided with the service. For example, Zip codes, County names, Country names, etc.
- Recommended Enrichments: Recommended enrichments are based on semantic types. For example, a geographic location can be enriched with attributes such as Population, Latitude, Longitude, etc.
- Required Thresholds: The profiling process uses specific thresholds to make decisions about specific semantic types. As a general rule, 85% of the data values in a column must meet the criteria for a single semantic type before the system classifies the column. As a result, a column that contains 70% first names and 30% “other” doesn’t meet the threshold, and therefore no recommendations are made.
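The profiling ideas above (pattern-based types, reference-based types, and the 85% threshold) can be sketched in a few lines of Python. This is only an illustration of the rule, not Oracle's profiling engine: the regular expressions, the tiny city list, and the choice to ignore empty values are all assumptions made for the example.

```python
import re

THRESHOLD = 0.85  # share of values that must match a single semantic type

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

# Hypothetical stand-in for the reference knowledge loaded with the service.
REFERENCES = {
    "city": {"dublin", "london", "paris", "madrid"},
}


def matches(semantic_type, value):
    """True if the value fits the pattern or reference list of the given type."""
    if semantic_type in PATTERNS:
        return PATTERNS[semantic_type].match(value) is not None
    return value.lower() in REFERENCES[semantic_type]


def classify_column(values):
    """Return (semantic_type, ratio); the type is None when below the threshold."""
    cleaned = [v.strip() for v in values if v and v.strip()]  # assumption: skip empties
    if not cleaned:
        return None, 0.0
    ratios = {}
    for semantic_type in list(PATTERNS) + list(REFERENCES):
        hits = sum(1 for v in cleaned if matches(semantic_type, v))
        ratios[semantic_type] = hits / len(cleaned)
    best = max(ratios, key=ratios.get)
    if ratios[best] >= THRESHOLD:
        return best, ratios[best]
    return None, ratios[best]


emails = [f"user{i}@example.com" for i in range(9)] + ["n/a"]   # 90% emails
mixed = ["Dublin"] * 7 + ["x1", "x2", "x3"]                     # 70% cities
email_result = classify_column(emails)
mixed_result = classify_column(mixed)
print(email_result, mixed_result)
```

The first column is classified as an email column because 90% of its values match, while the second stays unclassified at 70%, which mirrors the first-names example in the text.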
One click “Explain”
Analyse Data with Explain:
We can find useful insights about our data using the Explain feature, which uses Oracle’s machine learning. Explain analyses the selected column within the context of its data set and generates text descriptions of the insights it finds. The insight types are Basic Facts, Key Drivers, Segments, and Anomalies.
- Basic Facts: Displays the basic distribution of the column’s values.
- Key Drivers: Shows the columns in the data set that have the highest degree of correlation with the selected column outcome.
- Segments: Displays the key segments (or groups) from the column values.
- Anomalies: Identifies a series of values where one of the (aggregated) values deviates substantially from what the regression algorithms expect.
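Two of these insight types can be illustrated with a small Python sketch. It assumes Pearson correlation for ranking key drivers and a straight-line least-squares fit for spotting anomalies; Explain's actual algorithms are not described here, so both choices, along with the function names, the `tolerance` parameter, and the sample data, are assumptions for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def key_drivers(columns, target):
    """Rank the other columns by absolute correlation with the target column."""
    scores = {
        name: pearson(vals, columns[target])
        for name, vals in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


def anomalies(values, tolerance=2.0):
    """Indices whose residual from a straight-line fit exceeds tolerance * spread."""
    n = len(values)
    if n < 3:
        return []
    xs = range(n)
    mx, my = (n - 1) / 2, sum(values) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, values)) / sxx
    residuals = [y - (my + slope * (x - mx)) for x, y in zip(xs, values)]
    spread = sqrt(sum(r * r for r in residuals) / n)  # root mean squared residual
    return [i for i, r in enumerate(residuals) if spread and abs(r) > tolerance * spread]


# Hypothetical sample data.
columns = {
    "revenue": [10, 20, 30, 40, 50],
    "ad_spend": [1, 2, 3, 4, 5],
    "temperature": [15, 3, 22, 9, 14],
}
drivers = key_drivers(columns, "revenue")
outliers = anomalies([10, 12, 14, 16, 60, 20, 22, 24])
print(drivers[0][0], outliers)
```

Here `ad_spend` ranks first because it moves in step with `revenue`, and the value 60 is flagged because it sits far from the trend line that the other points follow.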
Machine learning for predictive analytics
Train and Apply Oracle Analytics Predictive Models:
- To mine our data set, predict a target value, or identify classes of records, we can use Oracle Analytics predictive models, which use several embedded machine learning algorithms. The data flow editor helps us to create, train, and apply Oracle Analytics predictive models.
- For any of our machine learning modelling needs, Oracle Analytics provides algorithms for numeric prediction, multi-classifier, binary classifier, and clustering.
- The typical workflow to create and use oracle analytics predictive models is as below:
Train a model using sample data -> Evaluate a model -> Apply a model to your data using a data flow -> Apply a predictive model to your project data
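The train, evaluate, and apply steps can be illustrated outside the product with a very small numeric prediction example in Python. In Oracle Analytics these steps are carried out in the data flow editor, not in code; the single-feature least-squares model, the function names, and the sample numbers below are all assumptions chosen to keep the sketch short.

```python
from math import sqrt


def train(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return {"slope": slope, "intercept": my - slope * mx}


def apply_model(model, xs):
    """Score new rows with a trained model."""
    return [model["intercept"] + model["slope"] * x for x in xs]


def evaluate(model, xs, ys):
    """Root mean squared error on rows that were not used for training."""
    preds = apply_model(model, xs)
    return sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys))


# 1. Train a model using sample data (hypothetical values).
model = train([1, 2, 3, 4], [3, 5, 7, 9])

# 2. Evaluate the model on held-out rows.
rmse = evaluate(model, [5, 6], [11, 13])

# 3. Apply the model to new data.
predictions = apply_model(model, [10, 20])
print(model, rmse, predictions)
```

Keeping evaluation separate from training matters for the same reason it does in the product: a model should be judged on rows it has not already seen before it is applied to project data.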
Shalini Mahajan is a Senior Data Analytics Consultant at Vertice. She has over 6 years of experience working with database management and data analytics. She began her career as an Application Software Development Consultant at NTT Data after she finished a Bachelor of Engineering degree in ECE from Visvesvaraya Technological University. She is very passionate about analytics and business service innovation. She holds a Master’s Degree in Business Analytics from UCD Michael Smurfit Graduate Business School.