
[Bug] Data-Prepper otel_logs_source: Incorrect OTLP Log Structure Mapping #5455

Open
rafael-gumiero opened this issue Feb 21, 2025 · 2 comments

@rafael-gumiero
Contributor

Bug Description

Data Prepper's otel_logs_source pipeline generates documents that do not follow the OpenTelemetry log data model specification. The current implementation maps fields incorrectly and does not preserve the standard OTel log record structure.

Current Behavior

Currently, the Data Prepper pipeline generates documents with this incorrect structure:

{
  "_index": "otel-logs-osi-classic",
  "_id": "3ewMKZUBlpG2jc_nugaw",
  "_source": {
    "traceId": "",
    "spanId": "",
    "severityText": "INFO",
    "flags": 0,
    "time": "2025-02-21T15:07:20.691914Z",
    "severityNumber": 9,
    "droppedAttributesCount": 1,
    "serviceName": null,
    "body": "the message",
    "observedTime": "1970-01-01T00:00:00Z",
    "schemaUrl": "",
    "log.attributes.app": "server"
  },
  "fields": {
    "observedTime": [
      "1970-01-01T00:00:00.000Z"
    ],
    "time": [
      "2025-02-21T15:07:20.691Z"
    ]
  }
}

Expected Behavior

The OpenTelemetry Collector's OpenSearch Exporter already implements the correct structure, following both the OTEL specification and ss4o. Here's an example of the same log being correctly formatted by the OpenSearch Exporter:

{
  "_index": "teste-logs-osi",
  "_id": "LzYMKZUBRGZ4BbSsiBQY",
  "_source": {
    "attributes": {
      "app": "server",
      "data_stream": {
        "dataset": "default",
        "namespace": "namespace",
        "type": "record"
      }
    },
    "body": "the message",
    "instrumentationScope": {},
    "observedTimestamp": "2025-02-21T15:07:22.004464639Z",
    "severity": {
      "text": "INFO",
      "number": 9
    },
    "@timestamp": "2025-02-21T15:07:20.691914Z"
  }
}

Key differences and issues:

  1. Field Names:

    • Wrong: "time" → Should be: "timestamp"
    • Wrong: "observedTime" → Should be: "observedTimestamp"
  2. Structure Issues:

    • Resource attributes are flattened instead of being nested under "resource"
    • Instrumentation scope is missing
    • Attributes are incorrectly prefixed with "log.attributes." instead of being nested
    • Invalid observed_timestamp defaulting to the Unix epoch ("1970-01-01T00:00:00Z")
  3. Missing Standard Fields:

    • Proper resource context
    • Instrumentation scope information
    • Structured attribute mapping
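If a short-term workaround is acceptable, the field-name mismatches (point 1 above) could likely be patched inside the pipeline with Data Prepper's rename_keys processor. This is a sketch, not a tested configuration, and it does not address the flattened attributes or the missing instrumentation scope:

```yaml
# Hypothetical partial workaround inserted between source and sink.
# Only renames the non-spec field names; nesting issues remain.
processor:
  - rename_keys:
      entries:
        - from_key: "time"
          to_key: "@timestamp"
        - from_key: "observedTime"
          to_key: "observedTimestamp"
```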

Steps to Reproduce

  1. Configure Data Prepper with an otel_logs_source pipeline
  2. Send OTLP format logs through OpenTelemetry Collector
  3. Examine the resulting documents in OpenSearch
  4. Compare with the OTEL log data model specification
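For step 2, a minimal Collector configuration along these lines can be used. This is a sketch assuming a self-managed Data Prepper listening on otel_logs_source's default gRPC port (21892); an OSIS pipeline would instead be addressed via its HTTPS ingestion endpoint, as in the pipeline configuration below. The "data-prepper" hostname is a placeholder:

```yaml
# Sketch: forward OTLP logs from the Collector to Data Prepper.
# Hostname and TLS settings are assumptions for local testing.
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  otlp/dataprepper:
    endpoint: "data-prepper:21892"  # default otel_logs_source gRPC port
    tls:
      insecure: true  # local testing only

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlp/dataprepper]
```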

Pipeline Configuration

version: "2"
otel-logs-pipeline:
  source:
    otel_logs_source:
      # Provide the path for ingestion. ${pipelineName} will be replaced with sub-pipeline name, i.e. otel-logs-pipeline, configured for this pipeline.
      # In this case it would be "/otel-logs-pipeline/v1/logs".
      path: "/${pipelineName}/v1/logs"
  sink:
    - opensearch:
        # Provide an AWS OpenSearch Service domain endpoint
        hosts: [ "https://XXXXX.us-east-1.es.amazonaws.com" ]
        aws:
          # Provide a Role ARN with access to the domain. This role should have a trust relationship with osis-pipelines.amazonaws.com
          sts_role_arn: "arn:aws:iam::XXXXXX:role/role-osi-pipeline-otel-logs-aos"
          # Provide the region of the domain.
          region: "us-east-1"
          # Enable the 'serverless' flag if the sink is an Amazon OpenSearch Serverless collection
          serverless: false
          # serverless_options:
            # Specify a name here to create or update network policy for the serverless collection
            # network_policy_name: "network-policy-name"
        index: "otel-logs-osi"
        # Enable the 'distribution_version' setting if the AWS OpenSearch Service domain is of version Elasticsearch 6.x
        # distribution_version: "es6"
        # Enable and switch the 'enable_request_compression' flag if the default compression setting is changed in the domain. See https://docs.aws.amazon.com/opensearch-service/latest/developerguide/gzip.html
        # enable_request_compression: true/false
        # Optional: Enable the S3 DLQ to capture any failed requests in an S3 bucket. Delete this entire block if you don't want a DLQ.
        dlq:
          s3:
            # Provide an S3 bucket
            bucket: "aws-s3-XXXX"
            # Provide a key path prefix for the failed requests
            key_path_prefix: "otel-logs-pipeline/logs/dlq"
            # Provide the region of the bucket.
            region: "us-east-1"
            # Provide a Role ARN with access to the bucket. This role should have a trust relationship with osis-pipelines.amazonaws.com
            sts_role_arn: "arn:aws:iam::XXXXX:role/role-osi-pipeline-otel-logs-aos"

Environment

  • Data Prepper Version: 2.10.2 (OSI)
  • OpenSearch Version: 2.17 (Amazon OpenSearch Service)
  • OpenTelemetry Collector Version: v0.120.0

Impact

  1. Inconsistency with OTEL specification
  2. Difficult to implement standard log queries
  3. Poor integration with OTEL ecosystem
  4. Compromised observability capabilities
  5. Extra development effort for workarounds

Questions

  1. Is there a plan to align the implementation with the OTEL specification or ss4o?
  2. Can custom processors be used to transform the data into the correct format?
@KarstenSchnitter
Collaborator

Thanks for providing this well-written issue. You will find there is a lot of discussion in that direction in #5259. The issue extends to all OpenTelemetry documents generated by Data Prepper. Your focus is mainly on the data format in the _source field, while the current mapping in Data Prepper tries to avoid indexing conflicts as much as possible. See #5259 for an explanation and the current state of the discussion.

Note that the naming conventions for OpenTelemetry logs and all other signals (metrics and traces) are taken directly from the protobuf specification. You will find some deviations of those definitions from the OpenTelemetry specifications you linked.

@rafael-gumiero
Contributor Author

@KarstenSchnitter Thank you for the heads-up! I believe we've reached a similar conclusion: adopting a simple schema makes sense in order to deliver something ready for the user. I wouldn't want to add an extra transformation step for this purpose.

We currently work with other observability vendors and need to deliver the same experience from a consumption perspective (for example, Dynatrace delivers data in the same shape as ss4o/the OpenSearch exporter).

For our scenario, I believe it makes sense in the short term to proceed with the OpenSearch exporter to avoid adding cognitive load to the team.
