Troubleshooting Guide

Common issues and solutions for MeshOptixIQ deployment and operation.

License Issues

Error: "No license key found"

Cause: No license key is set in the environment or in the license file.

Solutions:

  1. Set environment variable (recommended for Docker):
    export MESHOPTIXIQ_LICENSE_KEY="mq-prod-xxxxxxxxxx"
    meshq version
  2. Create license file (recommended for persistent installations):
    mkdir -p ~/.meshoptixiq
    echo "mq-prod-xxxxxxxxxx" > ~/.meshoptixiq/license.key
    chmod 600 ~/.meshoptixiq/license.key
    meshq version
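
To see which of the two methods would supply a key on a given host, a small shell helper can check both locations. This is only a sketch: the function name license_source is made up for illustration, and checking the environment variable before the file is an assumption, since the lookup order MeshOptixIQ actually uses is not stated in this guide.

```shell
# Print where a license key would be picked up from: "env", "file", or "none".
# Assumption: the environment variable is consulted before the license file.
license_source() {
  key_file="${HOME}/.meshoptixiq/license.key"
  if [ -n "${MESHOPTIXIQ_LICENSE_KEY:-}" ]; then
    echo "env"
  elif [ -s "$key_file" ]; then
    # -s is true only if the file exists and is not empty
    echo "file"
  else
    echo "none"
  fi
}
```

Run license_source in the same shell (or container entrypoint) that launches meshq. A result of none corresponds to the "No license key found" error.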

Error: "License has expired"

Cause: Your license expiration date has passed.

Solutions:

  1. Renew your license at: https://meshoptixiq.com/renew
  2. Update your license key using one of the methods above
  3. Restart the application or container

Warning: "Running in grace period"

Cause: The application cannot reach the license validation server, usually because of a network connectivity problem.

Impact: The application keeps working for up to 72 hours without contact with the license server.

Solutions:

  1. Check internet connectivity:
    ping api.meshoptixiq.com
    curl https://api.meshoptixiq.com/health
  2. Check DNS resolution:
    nslookup api.meshoptixiq.com
    dig api.meshoptixiq.com
  3. Check firewall: Ensure HTTPS outbound (TCP/443) is allowed to api.meshoptixiq.com
  4. Once connectivity is restored, the grace period resets automatically on the next successful validation
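
As a rough way to reason about the 72-hour window, the arithmetic can be sketched in shell. The function name grace_hours_left and the idea of passing in the time of the last successful validation are assumptions for illustration. How MeshOptixIQ records that timestamp is not described in this guide.

```shell
# Print the number of whole hours left in a 72-hour grace period.
# $1: epoch seconds of the last successful license validation (assumed known)
# $2: current epoch seconds (optional, defaults to the current time)
grace_hours_left() {
  last_ok=$1
  now=${2:-$(date +%s)}
  left_seconds=$(( 72 * 3600 - (now - last_ok) ))
  if [ "$left_seconds" -lt 0 ]; then
    left_seconds=0
  fi
  echo $(( left_seconds / 3600 ))
}
```

For example, if the last validation was 90 minutes ago, grace_hours_left prints 70, because 70.5 hours remain and the result is rounded down.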

CLI Issues

Error: "meshq: command not found"

Cause: The package is not installed, or the directory containing the meshq executable is not on your PATH.

Solutions:

  1. Verify installation:
    pip list | grep -i meshoptixiq
  2. Reinstall package:
    cd network_discovery
    pip install -e .
  3. Check PATH:
    which meshq
    echo $PATH
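
When the package is installed but meshq is still not found, the usual culprit is that pip's script directory is missing from PATH (for example ~/.local/bin after a pip install --user on Linux, or the bin directory of a virtual environment that is not activated). The helper below is a minimal sketch for checking one directory. The name path_contains is hypothetical.

```shell
# Return 0 if directory $1 is an exact entry in PATH, 1 otherwise.
path_contains() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (commented out so the sketch has no side effects):
# path_contains "$HOME/.local/bin" || echo "Add ~/.local/bin to PATH"
```

Wrapping PATH in colons makes the match exact, so /opt/tools does not falsely match an entry such as /opt/tools/bin.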

Error: "No data collected from any device"

Causes: SSH connectivity issues, incorrect credentials, or inventory syntax errors.

Solutions:

  1. Verify inventory YAML syntax:
    python -c "import yaml; yaml.safe_load(open('inventory.yaml'))"
  2. Test SSH manually:
    ssh username@device-ip
    # Try the exact credentials from your inventory file
  3. Check credentials: Verify username/password in inventory file match device configuration
  4. Verify firewall rules: Ensure SSH (TCP/22) is allowed from the MeshOptixIQ host to devices
  5. Check device logs: Look for authentication failures in device logs

Error: "Parser failed for device X"

Cause: Unexpected CLI output format from the device.

Solutions:

  1. Check vendor type: Ensure vendor field in inventory matches device type (e.g., cisco_ios, arista_eos)
  2. Check device version: Very old or very new device firmware may have different output formats
  3. Report issue: Contact support@meshoptixiq.com with device type and version

API Issues

Error: 401 Unauthorized

Cause: Missing or invalid API key in the request.

Solutions:

  1. Include API key header:
    curl -H "X-API-Key: your-api-key" \
      https://api.meshoptixiq.com/queries/
  2. Verify API key: Check your API key in the customer dashboard
  3. Check key expiration: API keys may expire based on your license
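
When scripting against the API, it helps to capture the HTTP status code rather than read the response body by eye. The sketch below maps the status codes discussed in this guide to a short hint. The function name explain_status and the hint wording are illustrative only. The curl option -w '%{http_code}' prints the status code, or 000 if no HTTP response was received.

```shell
# Translate an HTTP status code into a troubleshooting hint.
explain_status() {
  case "$1" in
    200) echo "200: request succeeded" ;;
    401) echo "401: missing or invalid API key" ;;
    504) echo "504: query too complex or database too slow" ;;
    000) echo "000: no HTTP response, check connectivity" ;;
    *)   echo "$1: not covered by this guide" ;;
  esac
}

# Example use (commented out because it needs network access and a real key):
# code=$(curl -s -o /dev/null -w '%{http_code}' \
#   -H "X-API-Key: your-api-key" https://api.meshoptixiq.com/queries/)
# explain_status "$code"
```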

API server fails to start

Cause: The API_KEY environment variable is not set. As of v0.1.0 this variable is required, and the server exits immediately at startup if it is missing.

Solution: Set a strong random value before starting the container:

# Generate a key
openssl rand -hex 32

# Pass it to the container
docker run -e API_KEY=<generated-value> ... meshoptixiq/discovery-agent:latest

For Docker Compose, add it to your .env file and reference it as API_KEY: ${API_KEY}.
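
A wrapper script can catch a missing key before the container is started, which gives a clearer message than a container that exits immediately. The following is a sketch only: the function name api_key_ok is hypothetical, and the 32-character minimum is an arbitrary threshold chosen for this example, not a documented requirement (openssl rand -hex 32 produces 64 characters, so a generated key passes).

```shell
# Return 0 if API_KEY is set and at least 32 characters long, 1 otherwise.
# The length threshold is an assumption made for this sketch.
api_key_ok() {
  key="${API_KEY:-}"
  [ "${#key}" -ge 32 ]
}

# Example use (commented out so the sketch has no side effects):
# api_key_ok || { echo "ERROR: API_KEY is missing or too short" >&2; exit 1; }
```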

Error: 504 Gateway Timeout

Cause: Query too complex or database performance issues.

Solutions:

  1. Simplify query: Reduce the scope of the query so it touches less data
  2. Use pagination for large result sets:
    GET /queries/list_devices?limit=100&offset=0
  3. Check database performance: Verify Neo4j or PostgreSQL has adequate resources
  4. Contact support: If timeouts persist, contact support@meshoptixiq.com
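
Pagination with limit and offset can be sketched as a loop that generates one request URL per page. The function name page_urls is hypothetical, and the caller supplies the total number of records, because the response format (and therefore how to detect the last page) is not described in this guide. The base URL in the usage comment combines the local port and query path shown elsewhere in this guide and may differ in your deployment.

```shell
# Print one URL per page for a limit/offset paginated endpoint.
# $1: base URL, $2: total number of records (assumed known), $3: page size
page_urls() {
  base=$1
  total=$2
  limit=$3
  offset=0
  while [ "$offset" -lt "$total" ]; do
    echo "${base}?limit=${limit}&offset=${offset}"
    offset=$(( offset + limit ))
  done
}

# Example use (commented out because it needs a running API and a real key):
# page_urls "http://localhost:8000/queries/list_devices" 250 100 |
#   while read -r url; do curl -s -H "X-API-Key: your-api-key" "$url"; done
```

With 250 records and a page size of 100, this produces three requests at offsets 0, 100, and 200.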

Docker Issues

Container Status: "Unhealthy"

Cause: Application startup failure or health check failure.

Solutions:

  1. Check container logs:
    docker logs meshoptixiq
    docker logs --tail 100 meshoptixiq
  2. Verify environment variables: Ensure license key and database URI are set correctly
  3. Test health endpoint:
    curl http://localhost:8000/health
  4. Check database connectivity:
    curl http://localhost:8000/health/ready

Error: "Cannot connect to Neo4j"

Cause: Database not accessible from container.

Solutions:

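  1. Use correct hostname: Inside a container, localhost refers to the container itself, so use host.docker.internal to reach a database running on the host (on Linux, also pass --add-host=host.docker.internal:host-gateway to docker run):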
    docker run -e NEO4J_URI="bolt://host.docker.internal:7687" ...
  2. Check Neo4j is running:
    docker ps | grep neo4j
    # OR
    systemctl status neo4j
  3. Verify Neo4j password: Match NEO4J_PASSWORD env var with actual Neo4j password
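
If the same NEO4J_URI value is reused between a host shell and a container, the hostname substitution from step 1 can be automated. The function below is a sketch with a hypothetical name (dockerize_uri). It only rewrites localhost and 127.0.0.1 and leaves every other hostname untouched.

```shell
# Rewrite a localhost-based URI so it points at the Docker host instead.
dockerize_uri() {
  printf '%s\n' "$1" |
    sed -E 's#^([a-z0-9+]+://)(localhost|127\.0\.0\.1)([:/]|$)#\1host.docker.internal\3#'
}

# Example use (commented out; the remaining docker run arguments are omitted here):
# docker run -e NEO4J_URI="$(dockerize_uri "$NEO4J_URI")" ...
```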

Database Issues

Neo4j: High Memory Usage

Cause: Large graph data or inefficient queries.

Solutions:

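  1. Increase heap size: Edit neo4j.conf (in Neo4j 5 these settings are named server.memory.heap.initial_size and server.memory.heap.max_size):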
    dbms.memory.heap.initial_size=4G
    dbms.memory.heap.max_size=8G
  2. Create indexes: Ensure common query patterns are indexed:
    CREATE INDEX FOR (d:Device) ON (d.hostname);
    CREATE INDEX FOR (i:Interface) ON (i.name);
  3. Archive old data: Consider archiving snapshots older than 90 days

PostgreSQL: Slow Query Performance

Cause: Missing indexes or insufficient resources.

Solutions:

  1. Check query execution plan:
    EXPLAIN ANALYZE SELECT * FROM devices WHERE hostname = 'switch-01';
  2. Ensure indexes exist: Check application logs for index creation confirmations
  3. Increase the connection limit: If the application is running out of database connections, raise max_connections in postgresql.conf (a PostgreSQL restart is required)

Diagnostic Scripts

Comprehensive Health Check

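#!/bin/bash
# Save as health-check.sh (requires curl and jq)
# Without pipefail, "curl ... | jq ." reports only jq's exit status,
# so a failed curl would never trigger the ERROR messages below.
set -o pipefail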

echo "=== MeshOptixIQ Health Check ==="
echo ""

# 1. Check license
echo "1. Checking license..."
meshq version || echo "ERROR: CLI not working"
echo ""

# 2. Check database connectivity
echo "2. Checking database..."
curl -s http://localhost:8000/health/ready | jq . || echo "ERROR: API not responding"
echo ""

# 3. Check license server connectivity
echo "3. Checking license server..."
curl -s https://api.meshoptixiq.com/health | jq . || echo "ERROR: Cannot reach license server"
echo ""

# 4. Check Docker containers
echo "4. Checking Docker containers..."
docker ps | grep -E "meshoptixiq|neo4j|postgres"
echo ""

# 5. Check disk space
echo "5. Checking disk space..."
df -h | grep -E "Filesystem|/$"
echo ""

echo "=== Health Check Complete ==="
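
The script above is meant to be read by a person: its last command is an echo, so it exits with status 0 even when a check fails. For use from cron or a monitoring system, a variant that counts failures may be more useful. The following is a sketch under that assumption. The helper name run_check and the output format are illustrative.

```shell
failures=0

# Run a command quietly and record whether it succeeded.
# $1: label to print, remaining arguments: the command to run
run_check() {
  name=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK   $name"
  else
    echo "FAIL $name"
    failures=$(( failures + 1 ))
  fi
}

# Example use (commented out because it needs a running deployment):
# run_check "CLI and license" meshq version
# run_check "API ready"       curl -sf http://localhost:8000/health/ready
# run_check "License server"  curl -sf https://api.meshoptixiq.com/health
# [ "$failures" -eq 0 ] || exit 1
```

Call run_check directly rather than inside $(...), because a command substitution runs in a subshell and the failure count would be lost.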

FAQ

Q: How often should I run discovery?

A: For most networks, daily discovery (via cron or scheduled task) is sufficient. For highly dynamic environments, consider every 6-12 hours.

Q: Can I run multiple discovery agents?

A: Yes, but each installation counts toward your device limit. The Pro plan allows 5 installations; the Enterprise plan allows an unlimited number.

Q: How do I back up my graph data?

A: For Neo4j, use neo4j-admin dump. For PostgreSQL, use pg_dump. See the Monitoring & Operations guide for details.

Q: What network permissions does MeshOptixIQ need?

A: MeshOptixIQ requires SSH (TCP/22) to network devices, HTTPS (TCP/443) to api.meshoptixiq.com for licensing, and connectivity to your graph database.

Getting Help

If you're still experiencing issues after trying these solutions, contact support@meshoptixiq.com.

Pro Tip: When contacting support, include:

  • Your license tier (Starter/Pro/Enterprise)
  • Operating system and version
  • Complete error messages from logs
  • Output from the diagnostic health check script
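
Part of this information can be gathered into one directory before writing to support. The function below is a minimal sketch: the name collect_support_info and the output file names are made up for illustration, and it assumes the diagnostic script was saved as health-check.sh in the current directory and made executable. License tier and log excerpts still need to be added by hand.

```shell
# Collect basic support information into directory $1.
collect_support_info() {
  out_dir=$1
  mkdir -p "$out_dir" || return 1

  # Operating system and version
  uname -a > "$out_dir/os.txt"
  if [ -r /etc/os-release ]; then
    cat /etc/os-release >> "$out_dir/os.txt"
  fi

  # Output of the diagnostic script, if present in the current directory
  if [ -x ./health-check.sh ]; then
    ./health-check.sh > "$out_dir/health-check.txt" 2>&1 || true
  fi

  echo "Support info written to $out_dir"
}
```

Review the collected files before sending them, since logs and environment details can contain secrets such as license or API keys.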