Azure Text Analytics filter plugin for Embulk

An Embulk filter plugin that sends a text column to the Azure Text Analytics API and writes the result to an output column.

Azure Text Analytics Documentation

Overview

  • Plugin type: filter

Configuration

  • api_type (string, required): API to call; the examples below use sentiment, languages, and keyPhrases
  • language (string, default: nil): language code of the input text, for example en
  • out_key_name (string, required): name of the output column
  • key_name (string, required): name of the input column to analyze
  • body_params (hash, default: {})
  • params (hash, default: {}): additional request parameters, for example minDocumentsPerWord
  • delay (integer, default: 0)
  • per_request (integer, default: 1)
  • bulk_size (integer, default: 100)
  • subscription_key (string, required): Azure subscription key
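
For context, a filter entry like the ones in the Example section sits under the filters key of an Embulk configuration file. The following is a minimal sketch, assuming a CSV input with a hypothetical text column named target_key and the built-in stdout output. The file name, input path, and column names are illustrative and are not taken from this plugin's documentation. The subscription key is read from an environment variable through Embulk's Liquid templating, which assumes the file name ends in .yml.liquid.

```yaml
# config.yml.liquid (hypothetical file name; the .liquid suffix enables Liquid env lookups)
in:
  type: file
  path_prefix: ./data/reviews_        # hypothetical input path
  parser:
    type: csv
    skip_header_lines: 1
    columns:
      - {name: id, type: long}
      - {name: target_key, type: string}   # column passed to the API
filters:
  - type: azure_text_analytics
    api_type: sentiment
    key_name: target_key
    out_key_name: target_key_sentiment
    language: en
    delay: 2
    # Assumes AZURE_TEXT_SUBSCRIPTION_KEY is set in the environment
    subscription_key: "{{ env.AZURE_TEXT_SUBSCRIPTION_KEY }}"
out:
  type: stdout
```

A file like this could be checked with `embulk preview config.yml.liquid` before running it with `embulk run config.yml.liquid`.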

Example

sentiment

  # en,es,fr,pt
  - type: azure_text_analytics
    api_type: sentiment
    key_name: target_key
    out_key_name: target_key_sentiment
    language: en
    delay: 2
    subscription_key: XXXXXXXXXXXXXXXXXXXXXXXXXXX
  • Languages supported for sentiment:
    • en
    • es
    • fr
    • pt

languages

  - type: azure_text_analytics
    api_type: languages
    out_key_name: target_key_languages
    language: en
    key_name: target_key
    delay: 2
    subscription_key: XXXXXXXXXXXXXXXXXXXXXXXXXXX

keyPhrases

  - type: azure_text_analytics
    api_type: keyPhrases
    out_key_name: target_key_keyPhrases
    key_name: target_key
    delay: 2
    subscription_key: XXXXXXXXXXXXXXXXXXXXXXXXXXX
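
Embulk applies the entries under filters in order, so several API types can be combined in one run. The following is a minimal sketch, assuming each filter only adds its out_key_name column and leaves the target_key column unchanged for the next filter; that behavior is an assumption and is not confirmed by this document.

```yaml
filters:
  # First pass: sentiment for target_key
  - type: azure_text_analytics
    api_type: sentiment
    key_name: target_key
    out_key_name: target_key_sentiment
    language: en
    delay: 2
    subscription_key: XXXXXXXXXXXXXXXXXXXXXXXXXXX
  # Second pass: key phrases for the same input column
  - type: azure_text_analytics
    api_type: keyPhrases
    key_name: target_key
    out_key_name: target_key_keyPhrases
    delay: 2
    subscription_key: XXXXXXXXXXXXXXXXXXXXXXXXXXX
```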

topics

  # en,es,fr,pt
  - type: azure_text_analytics_topics
    out_key_name: _parsed
    key_name: pr
    params:
      minDocumentsPerWord: 3
      maxDocumentsPerWord: 10
    subscription_key: {{ env.AZURE_TEXT_SUBSCRIPTION_KEY }}

  • Topic detection requires at least 100 documents.

Build

$ rake