Extract Information Node

Overview #

The Extract Information node is an AI node that pulls specific pieces of information out of unstructured text. You define the fields to extract, and the node returns them as structured data.

Key Features #

  • Custom field definition
  • Support for both single values and lists
  • Intelligent pattern recognition
  • Flexible data type handling
  • Format preservation
  • Context-aware extraction

Node Configuration #

Basic Setup #

  1. Node Addition:
   Workflow Builder → AI Actions → Extract Information
  2. Fields:
  • Node Name (optional, for identification)
  • Input Content (required; the text to analyze)
  • Extraction Fields (required; what to extract)

Field Configuration #

Field Properties #

{
    "name": "field_name",
    "description": "What to extract",
    "type": "string|number|boolean|array",
    "isList": true|false
}

Supported Data Types #

  • string: Text content
  • number: Numerical values
  • boolean: True/false values
  • array: Lists of items
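As a minimal sketch of how a field definition could be checked before it is pasted into the node, the Python below models the four properties shown under Field Properties. The `ExtractionField` class and its `to_config` method are hypothetical helpers written for this example, not part of the product; only the property names and the four type names come from this page.

```python
from dataclasses import dataclass

# The four type names listed under Supported Data Types.
SUPPORTED_TYPES = {"string", "number", "boolean", "array"}


@dataclass(frozen=True)
class ExtractionField:
    """Hypothetical local model of one extraction field."""
    name: str
    description: str
    type: str = "string"
    is_list: bool = False

    def to_config(self) -> dict:
        """Return the field using the property names from Field Properties."""
        if self.type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported type: {self.type!r}")
        return {
            "name": self.name,
            "description": self.description,
            "type": self.type,
            "isList": self.is_list,
        }


email = ExtractionField("email", "Extract email addresses")
addresses = ExtractionField("addresses", "Extract physical addresses", "array", True)
print(email.to_config())
print(addresses.to_config())
```

Rejecting an unknown type name locally is cheaper than discovering the typo after a workflow run.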

Usage Examples #

1. Basic Contact Information #

Field Configuration:
- Name: "email"
  Description: "Extract email addresses"
  Type: string

- Name: "phone"
  Description: "Extract phone numbers"
  Type: string

- Name: "addresses"
  Description: "Extract physical addresses"
  Type: array
  IsList: true
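For reference, the contact configuration above could be written in the JSON shape from Field Properties roughly as follows. This is a sketch: it assumes that a field listed without `IsList` is a single value (`isList` false), which this page does not state explicitly.

```python
import json

# Contact fields from the example above, in the Field Properties shape.
# Assumption: omitting IsList means the field holds a single value.
contact_fields = [
    {"name": "email", "description": "Extract email addresses",
     "type": "string", "isList": False},
    {"name": "phone", "description": "Extract phone numbers",
     "type": "string", "isList": False},
    {"name": "addresses", "description": "Extract physical addresses",
     "type": "array", "isList": True},
]

# json.dumps renders Python's False/True as JSON false/true.
print(json.dumps(contact_fields, indent=4))
```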

2. Product Information #

Field Configuration:
- Name: "product_name"
  Description: "Extract product name"
  Type: string

- Name: "price"
  Description: "Extract price in numbers"
  Type: number

- Name: "features"
  Description: "Extract product features"
  Type: array
  IsList: true

3. Document Analysis #

Field Configuration:
- Name: "dates"
  Description: "Extract all dates mentioned"
  Type: array
  IsList: true

- Name: "names"
  Description: "Extract person names"
  Type: array
  IsList: true

- Name: "key_points"
  Description: "Extract main points"
  Type: array
  IsList: true

Working Process #

1. Input Processing #

graph TD
    A[Input Text] --> B[Text Preprocessing]
    B --> C[Context Analysis]
    C --> D[Field Mapping]
    D --> E[Extraction Process]

2. Extraction Flow #

  1. Text Analysis
  • Content parsing
  • Structure identification
  • Pattern recognition
  2. Field Matching
  • Pattern matching
  • Context evaluation
  • Type validation
  3. Data Extraction
  • Value extraction
  • Type conversion
  • Format validation
  4. Output Formatting
  • Data structuring
  • Type enforcement
  • List processing
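To make the type conversion step more concrete, here is an illustrative sketch of turning a raw text fragment into a value of the declared type. It is not the node's actual implementation, which this page does not describe; the `coerce` function and its parsing rules (thousands separators, yes/no as booleans) are assumptions made for the example.

```python
import re


def coerce(raw: str, type_name: str):
    """Convert a raw text fragment to the declared field type (illustrative)."""
    if type_name == "string":
        return raw.strip()
    if type_name == "number":
        # Take the first numeric run, allowing commas as thousands separators.
        match = re.search(r"-?\d[\d,]*(?:\.\d+)?", raw)
        if match is None:
            raise ValueError(f"no number found in {raw!r}")
        number = float(match.group().replace(",", ""))
        return int(number) if number.is_integer() else number
    if type_name == "boolean":
        lowered = raw.strip().lower()
        if lowered in {"true", "yes"}:
            return True
        if lowered in {"false", "no"}:
            return False
        raise ValueError(f"not a boolean: {raw!r}")
    raise ValueError(f"unsupported type: {type_name!r}")


print(coerce("$1,299.50", "number"))
print(coerce("Price: 42 USD", "number"))
print(coerce(" Yes ", "boolean"))
```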

Output Format #

Standard Output Structure #

{
    "field_name1": "extracted_value",
    "field_name2": 123,
    "field_name3": ["item1", "item2", "item3"],
    "field_name4": true
}

Sample Response #

{
    "email": "john.doe@example.com",
    "phone": "+1-555-123-4567",
    "addresses": [
        "123 Main St, City, State 12345",
        "456 Side Ave, Town, State 67890"
    ]
}
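A downstream step may want to confirm that a response has the declared shape before using it. The sketch below checks the sample response against a name-to-type mapping; `matches_type` and `check_output` are hypothetical helpers for this example, not functions provided by the node.

```python
import json


def matches_type(value, type_name: str) -> bool:
    """Check a parsed JSON value against one of the four supported type names."""
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "array":
        return isinstance(value, list)
    raise ValueError(f"unsupported type: {type_name!r}")


def check_output(output: dict, expected: dict) -> list:
    """Return a list of problems; an empty list means the output looks right."""
    problems = []
    for name, type_name in expected.items():
        if name not in output:
            problems.append(f"{name}: missing")
        elif not matches_type(output[name], type_name):
            problems.append(f"{name}: expected {type_name}")
    return problems


sample = json.loads("""
{
    "email": "john.doe@example.com",
    "phone": "+1-555-123-4567",
    "addresses": [
        "123 Main St, City, State 12345",
        "456 Side Ave, Town, State 67890"
    ]
}
""")
expected = {"email": "string", "phone": "string", "addresses": "array"}
print(check_output(sample, expected))
```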

Best Practices #

Field Definition #

  1. Clear Descriptions
  • Be specific about what to extract
  • Include format requirements
  • Specify any constraints
  2. Appropriate Types
  • Use correct data types
  • Consider list vs single value
  • Match expected format
  3. Naming Conventions
  • Use descriptive names
  • Maintain consistency
  • Avoid special characters
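These guidelines can be turned into a small pre-flight check. The sketch below is one possible interpretation: the snake_case pattern mirrors the field names used in the examples on this page, and the three-word minimum for descriptions is an arbitrary threshold chosen for illustration, not a product rule.

```python
import re

# Assumption: lowercase letters, digits and underscores, as in the examples.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def lint_field(name: str, description: str) -> list:
    """Return human-readable warnings for one field definition."""
    warnings = []
    if not NAME_PATTERN.fullmatch(name):
        warnings.append(f"name {name!r} contains spaces, capitals or special characters")
    if len(description.split()) < 3:
        warnings.append("description is very short; state what to extract and in what format")
    return warnings


print(lint_field("order_date", "Extract the order date in YYYY-MM-DD format"))
print(lint_field("Order Date!", "date"))
```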

Input Preparation #

  1. Text Formatting
  • Clean input text
  • Remove irrelevant content
  • Maintain structure
  2. Content Organization
  • Group related information
  • Maintain context
  • Preserve relationships
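As an example of cleaning input while keeping its structure, the sketch below normalizes line endings and whitespace but preserves paragraph breaks, so related lines stay grouped. The `clean_input` function is a hypothetical preprocessing step you would run before passing text to the node; it is not something the node is documented to do itself.

```python
import re


def clean_input(text: str) -> str:
    """Normalize whitespace while keeping paragraph boundaries."""
    # Unify Windows and old Mac line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces and tabs inside each line, trim line edges.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    text = "\n".join(lines)
    # Keep at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


raw = "Name:   Jane  Doe \r\n\r\n\r\n\r\nEmail:\tjane@example.com\n"
print(clean_input(raw))
```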

Error Handling #

Common Issues and Solutions #

Issue             | Cause               | Solution
No Data Extracted | Unclear description | Improve field description
Wrong Data Type   | Type mismatch       | Verify field type configuration
Missing Values    | Content not found   | Check input text coverage
Invalid Format    | Format mismatch     | Specify format requirements

Error Messages #

Error Types:
- FIELD_NOT_FOUND: Required field not found in text
- TYPE_MISMATCH: Extracted data doesn't match specified type
- FORMAT_ERROR: Data format validation failed
- EXTRACTION_FAILED: General extraction failure
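A workflow that logs these errors could attach a suggested next step to each code. The sketch below pairs the four error codes with the solutions from Common Issues and Solutions; the pairing itself is an assumption made for this example, since this page does not say which error code corresponds to which issue, and `suggest_fix` is a hypothetical helper.

```python
# Assumed pairing of error codes with the solutions listed on this page.
REMEDIES = {
    "FIELD_NOT_FOUND": "Check input text coverage",
    "TYPE_MISMATCH": "Verify field type configuration",
    "FORMAT_ERROR": "Specify format requirements",
    "EXTRACTION_FAILED": "Improve field description",
}


def suggest_fix(error_code: str) -> str:
    """Return a suggested next step for an error code."""
    return REMEDIES.get(error_code, "Unknown error code; work through the diagnostic steps")


for code in ("TYPE_MISMATCH", "RATE_LIMITED"):
    print(f"{code}: {suggest_fix(code)}")
```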

Performance Optimization #

Best Practices #

  1. Input Optimization
  • Limit text length
  • Remove unnecessary content
  • Maintain relevant context
  2. Field Configuration
  • Limit number of fields
  • Use specific descriptions
  • Optimize field types
  3. Processing Efficiency
  • Group similar extractions
  • Use appropriate models
  • Cache common patterns
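One way to apply the input-length and caching advice outside the node is a thin wrapper that truncates the text and reuses results for identical requests. This is a sketch under stated assumptions: `CachedExtractor` is hypothetical, the `extract` callable stands in for whatever actually invokes the node, and the 8000-character default is an arbitrary limit rather than a documented one.

```python
import hashlib
import json


class CachedExtractor:
    """Wrap an extraction callable with input truncation and result caching."""

    def __init__(self, extract, max_chars: int = 8000):
        self._extract = extract      # callable(text, fields) -> dict
        self._max_chars = max_chars  # arbitrary limit chosen for illustration
        self._cache = {}

    def __call__(self, text: str, fields: list) -> dict:
        text = text[: self._max_chars]
        # Key on both the (truncated) text and the field configuration.
        payload = json.dumps([text, fields], sort_keys=True).encode("utf-8")
        key = hashlib.sha256(payload).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._extract(text, fields)
        return self._cache[key]


def placeholder_extract(text, fields):
    """Stand-in for the real call to the node; returns empty values."""
    return {field["name"]: None for field in fields}


extractor = CachedExtractor(placeholder_extract)
fields = [{"name": "email", "description": "Extract email addresses",
           "type": "string", "isList": False}]
print(extractor("Contact: john.doe@example.com", fields))
```

Truncation can cut off relevant context, so the limit should be chosen with the "Maintain relevant context" point above in mind.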

Integration Examples #

1. Form Processing #

Workflow:
Form Submission → Extract Information → Database Storage
Fields:
- Personal Information
- Contact Details
- Requirements

2. Document Analysis #

Workflow:
Document Upload → Text Extraction → Extract Information → Report Generation
Fields:
- Key Terms
- Important Dates
- Action Items

3. Email Processing #

Workflow:
Email Receipt → Extract Information → CRM Update
Fields:
- Customer Details
- Order Information
- Support Requirements

Troubleshooting Guide #

Diagnostic Steps #

  1. Verify input text quality
  2. Check field configurations
  3. Validate data types
  4. Review extraction patterns
  5. Check model responses

Common Solutions #

  1. No Data Extracted
  • Improve field descriptions
  • Check input text
  • Verify field names
  2. Wrong Data
  • Review field types
  • Check format specifications
  • Validate input content
  3. Performance Issues
  • Optimize input length
  • Reduce field count
  • Improve descriptions

Additional Resources #

Documentation #

  • Field configuration guide
  • Data type reference
  • Pattern matching guide
  • Best practices guide

Support #

  • Community forums
  • Technical support
  • Usage examples
  • FAQ section

Remember to regularly test your extraction configurations and validate the output to ensure accurate and reliable data extraction.

Updated on October 29, 2024