Merge pull request #115 from lenisha/master
Skills and samples updates
gmndrg authored Nov 14, 2023
2 parents 752ba2a + da0fe27 commit 1928aaa
Showing 45 changed files with 2,490 additions and 295 deletions.
71 changes: 71 additions & 0 deletions 01 - Search Index Creation/01.1 - BuiltIn Skills/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
# Adding a Built-In Skill to the Skillset

Add the Sentiment Analysis skill to the skillset and verify that sentiment values are generated and stored in the index.

Use https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-sentiment-v3 as the reference for the skill's inputs and outputs.


- Add a `sentiment` field to the index
```json
{
  "name": "sentiment",
  "type": "Edm.String",
  "searchable": true,
  "sortable": true,
  "filterable": true,
  "facetable": true
}
```

- Add `#Microsoft.Skills.Text.V3.SentimentSkill` to the skillset
```json
{
  "@odata.type": "#Microsoft.Skills.Text.V3.SentimentSkill",
  "name": "sentiment",
  "description": "",
  "context": "/document",
  "defaultLanguageCode": "en",
  "modelVersion": "",
  "includeOpinionMining": true,
  "inputs": [
    {
      "name": "text",
      "source": "/document/merged_text"
    }
  ],
  "outputs": [
    {
      "name": "sentiment",
      "targetName": "sentiment"
    },
    {
      "name": "confidenceScores",
      "targetName": "confidenceScores"
    },
    {
      "name": "sentences",
      "targetName": "sentences"
    }
  ]
}
```

- Update the indexer to add an output field mapping between the skill output and the index field

```json
{
  "sourceFieldName": "/document/sentiment",
  "targetFieldName": "sentiment"
}
```
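
The three fragments above have to agree with each other: the mapping's `sourceFieldName` must point at a path the skill actually writes, and its `targetFieldName` must name a field that exists in the index. A minimal Python sketch of that consistency check follows. The helper name `check_output_mapping` is hypothetical (it is not part of any Azure SDK), and the dictionaries are trimmed copies of the JSON above.

```python
# Hypothetical helper: checks that an indexer output field mapping lines up
# with a skill's outputs and with the index fields. Not part of any Azure SDK.

def check_output_mapping(index_fields, skill, mapping):
    """Return a list of problems; an empty list means the pieces agree."""
    problems = []

    # A skill writes each output to "<context>/<targetName>".
    context = skill.get("context", "/document").rstrip("/")
    produced = {f"{context}/{out['targetName']}" for out in skill["outputs"]}
    if mapping["sourceFieldName"] not in produced:
        problems.append(f"no skill output at {mapping['sourceFieldName']}")

    # The mapping target must be a field defined in the index.
    field_names = {field["name"] for field in index_fields}
    if mapping["targetFieldName"] not in field_names:
        problems.append(f"index has no field {mapping['targetFieldName']}")

    return problems


index_fields = [{"name": "sentiment", "type": "Edm.String"}]
skill = {
    "context": "/document",
    "outputs": [
        {"name": "sentiment", "targetName": "sentiment"},
        {"name": "confidenceScores", "targetName": "confidenceScores"},
        {"name": "sentences", "targetName": "sentences"},
    ],
}
mapping = {"sourceFieldName": "/document/sentiment", "targetFieldName": "sentiment"}

print(check_output_mapping(index_fields, skill, mapping))  # prints []
```

Running a check like this before updating the indexer can catch a renamed `targetName` that would otherwise leave the index field empty.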

Refer to the Postman collection for more details.


# Verify Index Data

- Search for all documents that contain the word `GitHub`, sorted by sentiment

- Search all documents and show the sentiment and locations facets

- Search for documents that have a location in Europe
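
The three checks above can be written as request bodies for the Search Documents REST call (`POST /indexes/{index}/docs/search`). The sketch below only builds the JSON bodies and does not send them. It assumes the index has a `locations` field of type `Collection(Edm.String)` that is filterable and facetable, and that the location value is stored as the literal string `Europe`; both are assumptions, so adjust them to your index. The function names are illustrative only.

```python
import json


def sorted_by_sentiment(term):
    # Full-text search for a term, ordered by the sortable `sentiment` field.
    # `sentiment` is a string label, so the ordering is alphabetical.
    return {"search": term, "orderby": "sentiment asc", "count": True}


def with_facets():
    # Match every document and ask for facet counts on two fields.
    return {"search": "*", "facets": ["sentiment", "locations"]}


def located_in(place):
    # OData string literals escape a single quote by doubling it.
    safe = place.replace("'", "''")
    # Assumes `locations` is a filterable string collection (see lead-in).
    return {"search": "*", "filter": f"locations/any(l: l eq '{safe}')"}


for body in (sorted_by_sentiment("GitHub"), with_facets(), located_in("Europe")):
    print(json.dumps(body))
```

Each printed body can be pasted into Postman or the JSON view of Search explorer.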


@@ -31,6 +31,11 @@
"key": "cog_services_key",
"value": "",
"enabled": true
},
{
"key": "env_function_url",
"value": "",
"enabled": true
}
],
"_postman_variable_scope": "environment",
5 changes: 5 additions & 0 deletions 01 - Search Index Creation/Create-Index-Postman.md
@@ -39,6 +39,11 @@ We recommend using this collection to create an initial index and then iterating

You can then *check the indexer status* to see if documents are processing or if there are any errors. If the indexer does not start running automatically, you can run the indexer manually.

## Verify Index

Use Search explorer or Postman to search the indexed data.


## Additional Resources

For more help working with Postman, see the [documentation](https://learning.postman.com/docs/getting-started/introduction/) on the Postman website.
4 changes: 3 additions & 1 deletion 01 - Search Index Creation/README.md
@@ -21,4 +21,6 @@ This folder includes three options for creating an index. Each of these approach

1. [Create a search index using the Azure Portal](./Create-Index-AzurePortal.md)
2. [Create a search index using PowerShell](./Create-Index-PowerShell.md)
3. [Create a search index using Postman](./Create-Index-Postman.md)
3. [Create a search index using Postman](./Create-Index-Postman.md)

4. Optionally, go through the Sentiment Analysis setup example in [01.1 - BuiltIn Skills](./01.1%20-%20BuiltIn%20Skills/)
7 changes: 5 additions & 2 deletions 02 - Web UI Template/README.md
@@ -86,9 +86,12 @@ docker run -d --env-file .env -p 80:80 kmworkshop.azurecr.io/web-ui:latest

1. Visual Studio 2019 or newer - [Download](https://visualstudio.microsoft.com/downloads/)

## 1. Update appsettings.json
## 1. Update appsettings configuration

To configure your web app to connect to your Azure services, simply update the *appsettings.json* file.
To configure your web app to connect to your Azure services, update the *appsettings.json* file and rebuild the container.

Alternatively, update the web app configuration:
![](../images/appsettings.png)

This file contains a mix of required and optional fields described below.

103 changes: 103 additions & 0 deletions 03 - Data Science and Custom Skills/FormRecognizer Skill/README.md
@@ -0,0 +1,103 @@

# Form Recognizer Custom Skill

Follow the MS Learn module [Build a Form Recognizer custom skill for Azure Cognitive Search](https://learn.microsoft.com/en-us/training/modules/build-form-recognizer-custom-skill-for-azure-cognitive-search/4-exercise-build-deploy)
to create a Form Recognizer service and deploy the Azure Function using Cloud Shell.

Then integrate the Form Recognizer prebuilt invoice model into the Cognitive Search pipeline.

# AnalyzeInvoice

This custom skill extracts invoice-specific fields using a pre-trained Form Recognizer model.


## Settings

This Azure Function requires access to an [Azure Form Recognizer](https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/) resource. The [prebuilt invoice model](https://docs.microsoft.com/azure/cognitive-services/form-recognizer/concept-invoices) is available in the 2.1 preview API.


This function requires two settings: `FORMS_RECOGNIZER_ENDPOINT`, set to your Form Recognizer 2.1-preview endpoint, and `FORMS_RECOGNIZER_KEY`, set to a valid Form Recognizer API key.
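
Azure Functions exposes application settings to the function as environment variables, so the two settings can be read and validated at startup. The following is a minimal sketch, not the skill's actual code: the helper name `load_settings` and the example endpoint and key values are placeholders chosen for illustration.

```python
import os

REQUIRED_SETTINGS = ("FORMS_RECOGNIZER_ENDPOINT", "FORMS_RECOGNIZER_KEY")


def load_settings(env=None):
    """Read the two required settings, failing early if either is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing application settings: " + ", ".join(missing))
    return {
        # Drop a trailing slash so paths can be appended consistently.
        "endpoint": env["FORMS_RECOGNIZER_ENDPOINT"].rstrip("/"),
        "key": env["FORMS_RECOGNIZER_KEY"],
    }


# Example with an explicit mapping instead of the real environment
# (placeholder values, not a real resource or key).
settings = load_settings({
    "FORMS_RECOGNIZER_ENDPOINT": "https://example.cognitiveservices.azure.com/",
    "FORMS_RECOGNIZER_KEY": "placeholder-key",
})
print(settings["endpoint"])
```

Failing at startup with the names of the missing settings tends to be easier to diagnose than an authentication error on the first invoice.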



## Sample Input:

This sample data points to a file in the Azure-Samples/azure-search-power-skills repository, but when the skill is integrated in a skillset, the URL and token will be provided by Cognitive Search.

```json
{
  "values": [
    {
      "recordId": "record1",
      "data": {
        "formUrl": "https://github.com/Azure-Samples/azure-search-power-skills/raw/master/SampleData/Invoice_4.pdf",
        "formSasToken": "?st=sasTokenThatWillBeGeneratedByCognitiveSearch"
      }
    }
  ]
}
```

## Sample Output:

```json
{
  "values": [
    {
      "recordId": "0",
      "data": {
        "invoices": [
          {
            "AmountDue": 63.0,
            "BillingAddress": "345 North St NY 98052",
            "BillingAddressRecipient": "Fabrikam, Inc.",
            "DueDate": "2018-05-31",
            "InvoiceDate": "2018-05-15",
            "InvoiceId": "1785443",
            "InvoiceTotal": 56.28,
            "VendorAddress": "4567 Main St Buffalo NY 90852",
            "SubTotal": 49.3,
            "TotalTax": 0.99
          }
        ]
      }
    }
  ]
}
```
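
The input and output above follow the custom Web API skill envelope: a `values` array in which every response record carries the `recordId` of the request record it answers, along with `data` and optional `errors` and `warnings`. The sketch below shows only that envelope handling. `run_skill` and `fake_analyze` are hypothetical names, and `fake_analyze` is a stand-in for the real Form Recognizer call, which is not shown here.

```python
def run_skill(body, analyze):
    """Apply `analyze(form_url, sas_token)` to every record in a skill request."""
    results = []
    for record in body.get("values", []):
        data = record.get("data", {})
        # Echo the incoming recordId so the indexer can match the result.
        result = {"recordId": record["recordId"], "data": {}, "errors": [], "warnings": []}
        try:
            invoices = analyze(data["formUrl"], data.get("formSasToken", ""))
            result["data"] = {"invoices": invoices}
        except Exception as exc:
            # Report the failure on this record only; other records still run.
            result["errors"].append({"message": str(exc)})
        results.append(result)
    return {"values": results}


def fake_analyze(form_url, sas_token):
    # Stand-in for the Form Recognizer request (assumption, for illustration).
    return [{"InvoiceId": "1785443", "InvoiceTotal": 56.28}]


request = {
    "values": [
        {
            "recordId": "record1",
            "data": {
                "formUrl": "https://example.com/Invoice_4.pdf",
                "formSasToken": "?st=placeholder",
            },
        }
    ]
}
response = run_skill(request, fake_analyze)
print(response["values"][0]["recordId"])  # prints record1
```

Catching errors per record keeps one unreadable invoice from failing the whole batch.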

## Sample Skillset Integration

To use this skill in a Cognitive Search pipeline, add a skill definition to your skillset.
Here's a sample skill definition for this example (update the inputs and outputs to reflect your scenario and skillset environment):

```json
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "name": "formrecognizer",
  "description": "Extracts fields from a form using a pre-trained form recognition model",
  "uri": "[AzureFunctionEndpointUrl]/api/AnalyzeInvoice?code=[AzureFunctionDefaultHostKey]",
  "httpMethod": "POST",
  "timeout": "PT1M",
  "context": "/document",
  "batchSize": 1,
  "inputs": [
    {
      "name": "formUrl",
      "source": "/document/metadata_storage_path"
    },
    {
      "name": "formSasToken",
      "source": "/document/metadata_storage_sas_token"
    }
  ],
  "outputs": [
    {
      "name": "invoices",
      "targetName": "invoices"
    }
  ]
}
```
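
When filling in the `uri` placeholders, note that a function host key may contain characters such as `/` or `=` that need percent-encoding in a query string. A small sketch of building the value safely follows; `build_skill_uri` is a hypothetical helper, and the base URL and key shown are made-up placeholders.

```python
from urllib.parse import urlencode


def build_skill_uri(function_base_url, host_key, function_name="AnalyzeInvoice"):
    """Compose the skill `uri` from the function app URL and its host key."""
    base = function_base_url.rstrip("/")
    # urlencode percent-encodes reserved characters in the key.
    return f"{base}/api/{function_name}?{urlencode({'code': host_key})}"


# Placeholder values for illustration only.
uri = build_skill_uri("https://my-func.azurewebsites.net/", "abc/123==")
print(uri)
```

The resulting string replaces the whole `uri` value in the skill definition.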

Refer to Postman Collection for more details.
Binary file not shown.
@@ -0,0 +1,5 @@
.git*
.vscode
local.settings.json
test
.venv
@@ -0,0 +1,130 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don’t work, or not
# install all needed dependencies.
#Pipfile.lock

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# Azure Functions artifacts
bin
obj
appsettings.json
local.settings.json
.python_packages
@@ -0,0 +1,6 @@
{
  "recommendations": [
    "ms-azuretools.vscode-azurefunctions",
    "ms-python.python"
  ]
}
@@ -0,0 +1,13 @@
{
  "version": "0.2.0",
  "configurations": [

    {
      "name": "Attach to Python Functions",
      "type": "python",
      "request": "attach",
      "port": 9091,
      "preLaunchTask": "func: host start"
    }
  ]
}
@@ -0,0 +1,8 @@
{
  "azureFunctions.deploySubpath": ".",
  "azureFunctions.scmDoBuildDuringDeployment": true,
  "azureFunctions.pythonVenv": ".venv",
  "azureFunctions.projectLanguage": "Python",
  "azureFunctions.projectRuntime": "~2",
  "debug.internalConsoleOptions": "neverOpen"
}