{"id":15801,"date":"2019-05-29T18:56:03","date_gmt":"2019-05-30T01:56:03","guid":{"rendered":"https:\/\/devwww.3cloudsolutions.com\/post\/knowledge-mining-showcase-azure-search-2\/"},"modified":"2024-04-08T13:51:08","modified_gmt":"2024-04-08T20:51:08","slug":"knowledge-mining-showcase-azure-search","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/knowledge-mining-showcase-azure-search\/","title":{"rendered":"Knowledge Mining Showcase: Azure Search"},"content":{"rendered":"<p class=\"p1\">Welcome to the first installment of an in-depth look at Knowledge Mining \u2013<span class=\"Apple-converted-space\">\u00a0 <\/span>the ability to use <a href=\"https:\/\/azure.microsoft.com\/en-us\/\"><span class=\"s1\">Microsoft Azure\u2019s<\/span><\/a> advanced AI <a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/search\/\"><span class=\"s1\">Search<\/span><\/a> capabilities to comb through all of your data (PDFs, emails, scanned documents, images, etc.) to glean insight. 
In this series, I\u2019m going to take you through Microsoft\u2019s Azure Cognitive Search (Azure Search with human-like reasoning capabilities) and show you the ins and outs of effectively using this awesome tool to uncover insight from enterprise data, whether structured or raw.<\/p>\n<p><!--more--><\/p>\n<p class=\"p3\">Let\u2019s begin by digging in to Azure Search.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"background-attachment: scroll; background-clip: border-box; background-color: transparent; background-image: none; background-origin: padding-box; background-position-x: 0%; background-position-y: 0%; background-repeat: repeat; background-size: auto; border-image-outset: 0; border-image-repeat: stretch; border-image-slice: 100%; border-image-source: none; border-image-width: 1; box-sizing: border-box; color: #36363e; cursor: pointer; font-family: ' open sans',sans-serif; font-size: 17px; font-style: normal; font-variant: normal; font-weight: 400; height: 513.18px; letter-spacing: normal; max-width: 847.59px; orphans: 2; outline-color: #00a4bd; outline-style: solid; outline-width: 1px; text-align: left; text-decoration: none; text-indent: 0px; text-transform: none; transition-delay: 0s; transition-duration: 0s; transition-property: none; transition-timing-function: cubic-bezier(0.25, 0.1, 0.25, 1); vertical-align: bottom; -webkit-text-stroke-width: 0px; white-space: normal; width: 847.59px; word-spacing: 0px; border: 0px none #36363e;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/iStock-knowledge-mining.jpg\" alt=\"knowledge-mining\" width=\"1024\" \/><i><\/i><u><\/u><\/p>\n<h2 class=\"p4\">Azure Search<\/h2>\n<p class=\"p3\">Managed by Microsoft, the intelligent Azure Search cloud solution-as-a-service has built-in cognitive abilities. 
Recognizing and extracting text and identity from images; highlighting key talking points from text; and the power to recognize and classify people, places, and things from text <i>and<\/i> images, are among its innovative capabilities.<\/p>\n<p class=\"p3\">This expansive offering can help dig deep into your organization\u2019s data, often uncovering rich insight. 3Cloud is currently working on one such project for a private global energy exploration and engineering company. Leaders there needed a way to look back at decades of data, recorded in paper files, without a team of archivists. We\u2019re working with the worldwide energy giant, using cloud-scale technology, to digitize, store, and offer deep search capabilities on more than six million documents. By combining Azure Search with Knowledge Mining, we will ultimately provide nearly instantaneous access to information that once would have taken a team of people potentially months (if ever) to uncover.<\/p>\n<p class=\"p3\">So, what is Knowledge Mining?<\/p>\n<h2 class=\"p4\">Knowledge Mining<\/h2>\n<p class=\"p3\">Knowledge Mining is a cognitive search-based technique for extracting facts from unstructured data. It\u2019s like having a crew of experts comb through your most important documents to discover and leverage data to drive your enterprise. 
This content comprehension capability can be used to create in-depth search resources that inform an organization\u2019s employees and enrich its clients and customers.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 974px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Knowledge_Mining_Solution.png\" alt=\"Knowledge_Mining_Solution\" width=\"974\" \/><\/p>\n<p class=\"p3\">In today\u2019s tutorial, I\u2019ll offer step-by-step instructions on creating a search service and an index.<\/p>\n<h2 class=\"p4\">Creating an Azure Search Service and Index<\/h2>\n<p class=\"p3\">The first step in creating an effective search is providing the data that we are going to search. Azure Search can be used against several data sources, both structured (Azure SQL Database and Cosmos DB) and unstructured, in the form of Azure Blob Storage. Blob indexers can extract text from major file formats such as Microsoft Office, PDF, and HTML documents.<\/p>\n<p class=\"p3\">In this how-to series on Knowledge Mining, we\u2019ll be focusing on unstructured data, so let\u2019s create some blob storage.<\/p>\n<p class=\"p3\">First, we need to create a storage account to hold our blobs. From the Azure Portal, select the <strong>Create a resource <\/strong>option and then type <strong>storage account<\/strong> into the search box. 
Select <strong>Storage account<\/strong> from the drop-down under the search box.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 927px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Azure_Marketplace.png\" alt=\"Azure_Marketplace\" width=\"927\" \/><\/p>\n<p class=\"p3\">Click on the <strong>Create<\/strong> button.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 883px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Storage_Account.png\" alt=\"Storage_Account\" width=\"883\" \/><\/p>\n<p class=\"p3\">And now add the details:<\/p>\n<ol>\n<li>Select your\u00a0<strong>Subscription<\/strong> and <strong>Resource group <\/strong>that you are using for your search.<\/li>\n<li>Enter a <strong>Storage account name<\/strong>. Note that this name must be unique across all storage account names.<\/li>\n<li>Select a <strong>Location<\/strong> that you will use for creating all of your resources throughout this exercise.<\/li>\n<li>Make the following selections:\n<ol>\n<li><strong>Performance<\/strong>: Standard<\/li>\n<li><strong>Account kind<\/strong>: Storage (general purpose v1)<\/li>\n<li><strong>Replication<\/strong>: Locally-redundant storage (LRS)<\/li>\n<\/ol>\n<\/li>\n<li>Click on the <strong>Next: Advanced&gt;<\/strong> button.<\/li>\n<\/ol>\n<p><img decoding=\"async\" style=\"width: 983px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Create_Storage_Account_Basics.png\" alt=\"Create_Storage_Account_Basics\" width=\"983\" \/><\/p>\n<p class=\"p3\">On the next page, make the selections shown below and then click on the <strong>Review + create<\/strong> button.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 941px;\" 
src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Create_Storage_Account_Advanced.png\" alt=\"Create_Storage_Account_Advanced\" width=\"941\" \/><\/p>\n<p class=\"p3\">Now press the <strong>Create<\/strong> button and wait for the deployment of your new storage account to complete.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 899px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Create_Storage_Account_Review.png\" alt=\"Create_Storage_Account_Review\" width=\"899\" \/><\/p>\n<p class=\"p3\">When your deployment is complete, you will see a link to your new storage account at the bottom of the page. Click on it to move to the next step.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 945px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Deployment_Complete.png\" alt=\"Deployment_Complete\" width=\"945\" \/><\/p>\n<p class=\"p3\">You\u2019ll now see the <strong>Storage Account Overview<\/strong> page. We will create our searchable blob storage in this storage account by clicking on the <strong>Blobs <\/strong>link in the middle of the page.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 933px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Storage_Account_Overview.png\" alt=\"Storage_Account_Overview\" width=\"933\" \/><\/p>\n<p class=\"p3\">Click on the big plus sign <strong>+ Container<\/strong> in the upper left of your screen and enter a name for your container. 
Think of a container as a folder that will hold all of our documents.<\/p>\n<p class=\"p3\">Select <strong>Container (anonymous read access for containers and blobs)<\/strong> for your <strong>Public access level<\/strong> and click on the <strong>OK<\/strong> button.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 591px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/New_Container.png\" alt=\"New_Container\" width=\"591\" \/><\/p>\n<p class=\"p3\">Click on the name of your new container.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 1016px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/New_Container_Name.png\" alt=\"New_Container_Name\" width=\"1016\" \/><\/p>\n<p class=\"p3\">You should now see the <strong>Overview<\/strong> of your blob container. Now we need to add the files that we are going to be searching. We do this by clicking on the <strong>Upload <\/strong>option in the upper left of the screen.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 802px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Blob_Overview.png\" alt=\"Blob_Overview\" width=\"802\" \/><\/p>\n<p class=\"p3\">An upload pane opens on the right side of your screen. For this series, we\u2019ll be working with a common dataset used in Microsoft\u2019s Knowledge Mining Bootcamp, which is an excellent introduction to Azure Search as well. To get the dataset, clone the GitHub repository at <span class=\"s1\">https:\/\/github.com\/Azure\/LearnAI-KnowledgeMiningBootcamp.git<\/span>. 
You\u2019ll find the sample data in the <strong>resources\/dataset\/<\/strong> folder.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 974px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Select_Dataset_Files.png\" alt=\"Select_Dataset_Files\" width=\"974\" \/><\/p>\n<p class=\"p3\">From the Azure Portal, click on the selection button, select all of the files in the dataset folder, and then click <strong>Upload<\/strong>.<\/p>\n<p class=\"p3\">\u00a0<img decoding=\"async\" style=\"width: 472px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Upload_Blob.png\" alt=\"Upload_Blob\" width=\"472\" \/><\/p>\n<p class=\"p3\">Now we are ready to create a search service and index all of the documents we just uploaded.<\/p>\n<p class=\"p3\">Create a new Azure Search Service by clicking on the <strong>Create a resource<\/strong> menu item in the Azure Portal and then typing <strong>Azure Search<\/strong> into the search box. Select <strong>Azure Search<\/strong> from the drop-down under the search box.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 839px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/New_Azure_Search_Service.png\" alt=\"New_Azure_Search_Service\" width=\"839\" \/><\/p>\n<p class=\"p3\">Click on the <strong>Create <\/strong>button.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 756px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Azure_Search_Create.png\" alt=\"Azure_Search_Create\" width=\"756\" \/><\/p>\n<p class=\"p3\">Enter a name for your search service, then choose the subscription, resource group, and location for it. 
You will want these to all match the settings of the storage account that you created earlier. You can use the <strong>Free<\/strong> pricing tier for these exercises, but understand that it is very limited and should only be used for dev\/test.<\/p>\n<p class=\"p3\">Click the <strong>Create<\/strong> button and wait for your new Search Service to be created.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 387px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/New_Search_Service.png\" alt=\"New_Search_Service\" width=\"387\" \/><\/p>\n<p class=\"p3\">Now that the search service is created, you should see something similar to this:<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 904px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Dashboard_Overview.png\" alt=\"Dashboard_Overview\" width=\"904\" \/><\/p>\n<p class=\"p3\">Next, we\u2019ll create the index that will hold all of the information for our search. Click on the <strong>Import data<\/strong> link at the top of the page. On the next screen, select <strong>Azure Blob Storage<\/strong> from the Data Source drop-down list.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 974px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Import_Data.png\" alt=\"Import_Data\" width=\"974\" \/><\/p>\n<p class=\"p3\">Name your data source and then click on <strong>Choose an existing connection<\/strong> and select the storage account that you created earlier. You will then select the container that you created and press the <strong>Select<\/strong> button. Leave all other fields at their default values and press the <strong>Next: Add cognitive search (Optional)<\/strong> button at the bottom of the screen. 
Azure Search will try to infer index fields from the files in your storage account. Since we are working with unstructured data, it will only come back with standard search fields.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 852px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Connect_Data.png\" alt=\"Connect_Data\" width=\"852\" \/><\/p>\n<p class=\"p3\">We\u2019re going to skip the cognitive search settings for now (much more on this in future posts). Press the <strong>Skip to: Customize target index<\/strong> button.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 762px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Add_Cognitive_Search.png\" alt=\"Add_Cognitive_Search\" width=\"762\" \/><\/p>\n<p class=\"p3\">We\u2019ll leave all of the settings at their default values for now. Press the <strong>Next: Create an indexer<\/strong> button.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 906px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Customize_Target_Index.png\" alt=\"Customize_Target_Index\" width=\"906\" \/><\/p>\n<p class=\"p3\">Change the <strong>Schedule<\/strong> value to <strong>Once<\/strong>. You can leave all of the other fields at their default values and press the <strong>Submit<\/strong> button at the bottom of the screen.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 814px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Create_Indexer.png\" alt=\"Create_Indexer\" width=\"814\" \/><\/p>\n<p class=\"p3\">When the index creation is complete, the portal takes you back to the Search Service Overview page. 
Notice that the Indexes, Indexers, and Data sources menu items in the middle of the page all have a (1) next to them showing the number of elements in each area.<\/p>\n<p class=\"p3\">Congratulations! You\u2019ve just built your first searchable index.<\/p>\n<p class=\"p3\">Click on the <strong>Indexers(1)<\/strong> link in the middle of the page. You\u2019ll see that the status is set to Warning. Let\u2019s look into that by clicking on the line for your indexer.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 974px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Indexers.png\" alt=\"Indexers\" width=\"974\" \/><\/p>\n<p class=\"p3\">Click on the line in the <strong>Execution details<\/strong> section for the indexer we just ran. In the new blade, you\u2019ll see several entries in the <strong>Warnings<\/strong> section. There are two kinds of warnings: 1) Document has unsupported content type, and 2) Truncated extracted text to 32768 characters. The first warning appears because several of our files contain images, and the standard indexer does not index images. We have to add a cognitive service for image cracking. We\u2019ll do this in our next session. The second warning appears because we selected the <strong>Free<\/strong> pricing tier when we created our search service. The Free pricing tier only allows a maximum of 32,768 characters to be extracted from a document.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 974px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Indexer_Summary.png\" alt=\"Indexer_Summary\" width=\"974\" \/><\/p>\n<p class=\"p3\">Go back to the Search Service Overview page and click on the <strong>Indexes(1)<\/strong> link and then again on the line with the index that you just created. From the Index screen we can query a selected index and test it out. 
Type \u201cMicrosoft\u201d in the <strong>Query String<\/strong> field and click <strong>Search<\/strong>. You\u2019ll see the results returned in JSON. Using the dataset from the Knowledge Mining Bootcamp, you\u2019ll see 10 documents returned.<\/p>\n<p class=\"p3\"><img decoding=\"async\" style=\"width: 974px;\" src=\"https:\/\/cdn2.hubspot.net\/hubfs\/257922\/Knowledge%20Mining%20Showcase%20-%20Azure%20Search\/Search_Explorer.png\" alt=\"Search_Explorer\" width=\"974\" \/><\/p>\n<h2 class=\"p4\">More to Come<\/h2>\n<p class=\"p3\">In the coming weeks I\u2019ll be exploring the many ways to use Azure Cognitive Search to more easily unearth knowledge from once-difficult-to-mine data sources.<span class=\"s3\"> Be sure to\u00a0<span class=\"s4\">subscribe to our blog<\/span>\u00a0so that you don&#8217;t miss a tutorial, or <a href=\"\/get-started\/\"><span class=\"s1\">contact us<\/span> today<\/a> to discover the many ways 3Cloud can make the most of your data!<\/span><\/p>\n<p class=\"p3\"><span style=\"font-size: 12px;\"><em>Editor&#8217;s note: This post was edited 10\/2020 to reflect system updates.<\/em><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Overview of Azure Search for knowledge mining, recognizing and extracting text and identity from 
images<\/p>\n","protected":false},"author":21,"featured_media":13938,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[260],"tags":[330,319],"class_list":["post-15801","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-ai","tag-knowledge-mining","tag-machine-learning-ai","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/15801","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=15801"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/15801\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media\/13938"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=15801"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/categories?post=15801"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=15801"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}