Introduction This post compares several Python implementations of term frequency-inverse document frequency (TF-IDF) for vectorizing word data. TF-IDF is used in the natural language processing (NLP) area of artificial intelligence to determine the importance of words in a document and across a collection of documents, also known as a corpus. Various implementations of TF-IDF were tested in Python to gauge how they would perform against a large set of data. Tested were sklearn, gensim...
Note: Private Access Channel is now available in Oracle Analytics and is recommended by Oracle for new connections to private data sources. For more information on the feature and the data sources it supports refer to: Connect to Private Data...
Introduction In the course of prepping some data for a machine learning activity using natural language processing (NLP), several methods were compared for performance and for the volume of data they could process efficiently. This post shows the performance of cleaning a small dataset and a larger one. All examples are in Python, and compare the use of pandas DataFrames, Dask DataFrames, and Apache Spark (PySpark). Test environments The small dataset was...
Retired This post has been retired, as OAC now connects to Essbase 19c via Remote Data Gateway and Private Access Channel. For other A-Team posts, visit A-Team Data Integration and Analytics.
This is the seventh in a series of blogs describing how one would go about modeling an enterprise ontology. In my last two blogs I showed how one could relatively easily create an informal taxonomy and add facets to it. In this blog we show how to model a formal taxonomy using the gist upper-level ontology to bootstrap our micro-pattern. From the gist definition of taxonomy we have the following: gist:taxonomy: A controlled vocabulary arranged as a hierarchy of concepts....
Content validated on 12/18/2020 with ODI Version 12.2.1.4.200304.2238 ATP Version Oracle Database 19c Enterprise Edition Release - Production Version 19.5.0.0.0 Background This blog walks through how to configure Oracle Data Integrator (ODI) on Oracle Cloud Marketplace Agents in High Availability (HA) Mode with the Autonomous Transaction Processing (ATP) Database to hold the ODI repository. To allow for true application HA, the agents configured throughout this blog will...
This is the sixth in a series of blogs describing how one would go about modeling an enterprise ontology. In my last post I showed how one would create an informal taxonomy using the gist upper-level ontology. To keep the previous blog from becoming hideously complex, I chose not to document any facets there. However, facets are clearly needed if your informal taxonomy is to have any value in the real world. To that end, I am building upon the previous...