feat: Add a parallel JSON log by default by favilo · Pull Request #1892 · elastic/rally

favilo · 2024-11-14T18:46:07Z

For ease of programmatic parsing, and uploading to Kibana with filebeat

Have you signed the contributor license agreement?
Have you followed the contributor guidelines?
Have you run make check-all successfully?
Did you choose a descriptive title and description for your PR?
(Only for maintainers) Did you apply appropriate labels and a milestone?

gbanasiak · 2024-11-18T12:59:15Z

Need to merge #1890 in for linter to pass.

gbanasiak

I did a quick functional test and I'm not getting timestamp field in the logs:

% esrally list races

% cat ~/.rally/logs/rally.log 
2024-11-18 13:12:31,530 -not-actor-/PID:16574 esrally.rally INFO OS [uname_result(system='Darwin', node='pc-913.home', release='23.6.0', version='Darwin Kernel Version 23.6.0: Thu Sep 12 23:35:10 PDT 2024; root:xnu-10063.141.1.701.1~1/RELEASE_ARM64_T6030', machine='arm64')]
2024-11-18 13:12:31,530 -not-actor-/PID:16574 esrally.rally INFO Python [namespace(name='cpython', cache_tag='cpython-312', version=sys.version_info(major=3, minor=12, micro=2, releaselevel='final', serial=0), hexversion=51118832, _multiarch='darwin')]
2024-11-18 13:12:31,557 -not-actor-/PID:16574 esrally.rally INFO Rally version [2.11.1.dev0 (git revision: 066bb7587468b4bef74b3b5978042d1dfd074d97)]
2024-11-18 13:12:31,557 -not-actor-/PID:16574 esrally.utils.net INFO Connecting directly to the Internet (no proxy support) for [all_proxy].
2024-11-18 13:12:31,557 -not-actor-/PID:16574 esrally.utils.net INFO Connecting directly to the Internet (no proxy support) for [all_proxy].
2024-11-18 13:12:31,557 -not-actor-/PID:16574 esrally.rally INFO Cleaning track dependency directory [/Users/grzegorz/.rally/libs]...
2024-11-18 13:12:31,557 -not-actor-/PID:16574 esrally.metrics INFO Creating file race store

% cat ~/.rally/logs/rally.json
{"message": "OS [uname_result(system='Darwin', node='pc-913.home', release='23.6.0', version='Darwin Kernel Version 23.6.0: Thu Sep 12 23:35:10 PDT 2024; root:xnu-10063.141.1.701.1~1/RELEASE_ARM64_T6030', machine='arm64')]", "taskName": null, "actorAddress": "-not-actor-"}
{"message": "Python [namespace(name='cpython', cache_tag='cpython-312', version=sys.version_info(major=3, minor=12, micro=2, releaselevel='final', serial=0), hexversion=51118832, _multiarch='darwin')]", "taskName": null, "actorAddress": "-not-actor-"}
{"message": "Rally version [2.11.1.dev0 (git revision: 066bb7587468b4bef74b3b5978042d1dfd074d97)]", "taskName": null, "actorAddress": "-not-actor-"}
{"message": "Connecting directly to the Internet (no proxy support) for [all_proxy].", "taskName": null, "actorAddress": "-not-actor-"}
{"message": "Connecting directly to the Internet (no proxy support) for [all_proxy].", "taskName": null, "actorAddress": "-not-actor-"}
{"message": "Cleaning track dependency directory [/Users/grzegorz/.rally/libs]...", "taskName": null, "actorAddress": "-not-actor-"}
{"message": "Creating file race store", "taskName": null, "actorAddress": "-not-actor-"}

% python --version
Python 3.12.2

% pip list | grep python-json
python-json-logger            2.0.7

gbanasiak · 2024-11-18T13:20:52Z

Let's also update https://github.com/elastic/rally/blob/master/docs/configuration.rst#logging, second paragraph.

favilo · 2024-11-18T19:26:47Z

I did a quick functional test and I'm not getting timestamp field in the logs:

Care to give it a try again @gbanasiak? I've added the timestamp field to the LogRecord. It corresponds to the %(asctime)s,%(msecs)d that we've been using for the other formatters. I've also added the ability to specify the timezone.
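
For readers following along, here is a minimal stdlib-only sketch of the idea: a formatter that emits one JSON object per record and includes a timestamp built the same way as the plain-text format. The class name `JsonTimestampFormatter` and its `utc` flag are hypothetical and are not the code from this PR; milliseconds are zero-padded here, which `%(msecs)d` does not do.

```python
import json
import logging
import time


class JsonTimestampFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per record, with a timestamp field."""

    def __init__(self, utc=True):
        super().__init__(datefmt="%Y-%m-%d %H:%M:%S")
        # logging.Formatter calls self.converter to turn record.created into a struct_time.
        self.converter = time.gmtime if utc else time.localtime
        self.timezone = "UTC" if utc else time.strftime("%Z")

    def format(self, record):
        payload = {
            # Date part from datefmt, then a comma and the milliseconds.
            "timestamp": f"{self.formatTime(record, self.datefmt)},{int(record.msecs):03d}",
            "name": record.name,
            "levelname": record.levelname,
            "message": record.getMessage(),
            "timezone": self.timezone,
        }
        return json.dumps(payload)


record = logging.LogRecord("esrally.metrics", logging.INFO, "demo.py", 1, "Creating %s", ("file race store",), None)
record.created, record.msecs = 0.0, 5.0  # pin the time so the output is reproducible
line = JsonTimestampFormatter(utc=True).format(record)
print(line)
```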

gbanasiak · 2024-11-19T11:50:33Z

That's better, now I'm getting this after formatting:

{
    "timestamp": "2024-11-19 11:09:55,605074",
    "process": 12826,
    "name": "esrally.driver.driver",
    "levelname": "INFO",
    "message": "Worker[0] finished executing tasks ['sort_country_code_no_can_match_shortcut'] in 0.020128 seconds",
    "taskName": "Task-145",
    "actorAddress": "ActorAddr-(T|:51749)",
    "timezone": "UTC"
}

There's a new taskName log attribute injected by Async I/O which is not included in plain logs. Nice.

To make this easier to ingest by Filebeat / Elastic Agent we could change timestamp to @timestamp and adjust the format to be ISO compatible, otherwise we would get an error in ES:

failed to parse date field [2024-11-19 11:09:55,605074] with format [strict_date_optional_time||epoch_millis]

This could be addressed by proper Filebeat configuration too.
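
As an illustration of the ISO-compatible format mentioned above, a small sketch using only the standard library. The helper name `to_iso_utc` is made up for this example; it takes the epoch seconds that `LogRecord.created` carries and renders them in a form that `strict_date_optional_time` accepts.

```python
from datetime import datetime, timezone


def to_iso_utc(epoch_seconds: float) -> str:
    """Render an epoch timestamp as ISO 8601 in UTC with millisecond precision."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    # isoformat() emits "+00:00"; "Z" is the shorter, equivalent UTC designator.
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_iso_utc(1732014595.5))
```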

I was thinking about taking this one step further and modifying log attributes to be ECS compliant (also ECS fields in Filebeat docs). Perhaps this could be done with object translator? But I don't find obvious candidates for some of the fields:

timestamp -> @timestamp
process -> process.pid
name -> ?
levelname -> log.level
message -> message
taskName -> ?
actorAddress -> ?
timezone -> not needed if @timestamp properly formatted

gbanasiak · 2024-11-19T15:58:08Z

I was thinking about taking this one step further and modifying log attributes to be ECS compliant [..]

We have to change process field name to something different due to collision with Filebeat mappings. The fields that have no good candidates could be placed under rally top-level key.
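
A rough sketch of the translation discussed above, assuming a flat record like the one shown earlier. The names `ECS_PATHS`, `set_path` and `to_ecs` are hypothetical. Placing `name` under `log.logger` is one possible choice for a field the list leaves open, and anything without a mapping lands under the `rally` top-level key. Values are passed through unchanged, so the timestamp would still need reformatting separately.

```python
# Fields with an ECS home: source key -> dotted ECS path.
ECS_PATHS = {
    "timestamp": "@timestamp",
    "process": "process.pid",
    "name": "log.logger",  # assumption: one possible home for the logger name
    "levelname": "log.level",
    "message": "message",
}
# Not needed once @timestamp carries its own offset.
DROPPED = {"timezone"}


def set_path(doc: dict, dotted: str, value) -> None:
    """Store value in doc at a dotted path, creating nested dicts as needed."""
    *parents, leaf = dotted.split(".")
    for key in parents:
        doc = doc.setdefault(key, {})
    doc[leaf] = value


def to_ecs(flat: dict) -> dict:
    """Translate a flat log record dict into a nested, ECS-style document."""
    doc = {}
    for key, value in flat.items():
        if key in DROPPED:
            continue
        # Unmapped fields are grouped under the "rally" top-level key.
        set_path(doc, ECS_PATHS.get(key, f"rally.{key}"), value)
    return doc


flat = {
    "timestamp": "2024-11-19 11:09:55,605074",
    "process": 12826,
    "name": "esrally.driver.driver",
    "levelname": "INFO",
    "message": "Worker[0] finished",
    "taskName": "Task-145",
    "actorAddress": "ActorAddr-(T|:51749)",
    "timezone": "UTC",
}
print(to_ecs(flat))
```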

favilo · 2024-11-20T03:35:06Z

I've used the ecs-logging Python library to actually implement on-spec JSON logging. This is much better than my ad hoc JSON formatter from before.

I've also subclassed the formatter in order to move the thespian log fields into their own rally.actor section.

gbanasiak · 2024-11-20T09:42:43Z

Example result:

{
  "@timestamp": "2024-11-20T09:39:43.312Z",
  "log.level": "warning",
  "message": "Invoking a scroll search with the 'search' operation is deprecated and will be removed in a future release. Use 'scroll-search' instead.",
  "ecs.version": "1.6.0",
  "log": {
    "logger": "esrally.driver.runner",
    "origin": {
      "file": {
        "line": 1214,
        "name": "runner.py"
      },
      "function": "__call__"
    },
    "original": "Invoking a scroll search with the 'search' operation is deprecated and will be removed in a future release. Use 'scroll-search' instead."
  },
  "process": {
    "name": "Actor_Worker__ActorAddr-LocalAddr.1",
    "pid": 30676,
    "thread": {
      "id": 130507456198208,
      "name": "ThreadPoolExecutor-0_0"
    }
  },
  "rally": {
    "actor": {
      "address": "ActorAddr-(T|:45685)"
    }
  }
}

gbanasiak

Great idea to use ecs-logging. I did another round of e2e tests and found no collisions when ingesting with Filebeat.

Can we drop log.original entirely? It's useful when documents are processed with an ES ingest pipeline, but I don't think we need this (at least not yet) as the message field has exactly the same content.

I left some comments and questions.

gbanasiak · 2024-11-20T09:57:48Z

esrally/log.py

+        if log_record.get("actorAddress"):
+            actor["address"] = log_record.pop("actorAddress")
+        if log_record.get("taskName"):
+            actor["task"] = log_record.pop("taskName")
+        if actor:
+            collections.deep_update(log_record, {"rally": {"actor": actor}})


I don't see rally.actor.task populated ever, so I think taskName is lost. Once this is explained we need to think about the right placement for this info. As per docs it should correspond to asyncio.Task name so this is not related to Thespian actors. I would put it outside of rally.actor, rally.asyncio maybe? But then maybe not rally.actor but rather rally.thespian? Ugh, naming...

What's the rationale for the new collections.deep_update() method? Couldn't we simply set rally key in log record dict? It's a new top-level element, so I think that's safe?

Or we can just drop taskName for now.

Changed in version 3.12: taskName was added

Huh, turns out that's fairly new.

We can add %(taskName)s to the format string we pass to the Formatter.

Huh, turns out that's fairly new.

Ah, yes, most recently I've tested with 3.11.7 from esbench env. After switching to 3.12.2 and:

diff --git a/esrally/resources/logging.json b/esrally/resources/logging.json
index c1e4b6da..7651ebc5 100644
--- a/esrally/resources/logging.json
+++ b/esrally/resources/logging.json
@@ -2,12 +2,12 @@
   "version": 1,
   "formatters": {
     "normal": {
-      "format": "%(asctime)s,%(msecs)d %(actorAddress)s/PID:%(process)d %(name)s %(levelname)s %(message)s",
+      "format": "%(asctime)s,%(msecs)d %(actorAddress)s/PID:%(process)d %(name)s %(taskName)s %(levelname)s %(message)s",
       "datefmt": "%Y-%m-%d %H:%M:%S",
       "()": "esrally.log.configure_utc_formatter"
     },
     "json": {
-      "format": "%(message)s",
+      "format": "%(message)s %(taskName)s",
       "()": "esrally.log.configure_ecs_formatter"
     },
     "profile": {

I can see taskName populated in rally.log but not in rally.json. I think it gets swallowed in here. We could override format_to_ecs() but I'm also fine just skipping it for now. We can revisit once we find a use for this information. What we're adding in process object is already very useful.

Still interested in hearing the rationale behind deep_update() :)

deep_update is so that just in case we have something already there, like someone adds a future mutator, we don't clobber any dict that is already defined. Like would happen with dict.update()
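
To make the difference concrete, a minimal sketch of a recursive merge next to plain dict.update(). This `deep_update` is written for illustration only and is not necessarily how Rally's `collections.deep_update()` is implemented.

```python
def deep_update(target: dict, updates: dict) -> dict:
    """Merge updates into target in place, recursing where both sides hold dicts."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_update(target[key], value)
        else:
            target[key] = value
    return target


# dict.update() replaces the whole "rally" value, so the earlier entry is lost.
shallow = {"rally": {"thespian": {"address": "-not-actor-"}}}
shallow.update({"rally": {"asyncio": {"task": "Task-1"}}})

# The recursive merge keeps what an earlier mutator already put there.
merged = {"rally": {"thespian": {"address": "-not-actor-"}}}
deep_update(merged, {"rally": {"asyncio": {"task": "Task-1"}}})

print(shallow)
print(merged)
```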

favilo · 2024-11-20T16:06:01Z

Can we drop log.original entirely?

Absolutely, I'm just going to drop that in logging.json by adding an exclude_fields field. That way, if someone wants it back, they can just edit the logging.json file.

These take the log record, and add additional fields to it. This is configurable via the `mutators` property in `~/.rally/logging.json`.
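
A rough sketch of the mutator idea, under stated assumptions: each mutator is taken to be a callable that receives the LogRecord and the output dict and edits the dict in place. The `Mutator` alias, the function names and the target keys (`rally.thespian`, `python.asyncio`) are illustrative choices, not the definitions from esrally/log.py.

```python
import logging
from typing import Any, Callable, Dict, List

# Assumed signature: (record, output dict) -> None, mutating the dict in place.
Mutator = Callable[[logging.LogRecord, Dict[str, Any]], None]


def add_actor_address(record: logging.LogRecord, log_record: Dict[str, Any]) -> None:
    address = getattr(record, "actorAddress", None)
    if address:
        log_record.setdefault("rally", {}).setdefault("thespian", {})["address"] = address


def add_asyncio_task(record: logging.LogRecord, log_record: Dict[str, Any]) -> None:
    task = getattr(record, "taskName", None)
    if task:
        log_record.setdefault("python", {}).setdefault("asyncio", {})["task"] = task


def apply_mutators(record: logging.LogRecord, log_record: Dict[str, Any], mutators: List[Mutator]) -> Dict[str, Any]:
    """Run each mutator in order over the same output dict."""
    for mutate in mutators:
        mutate(record, log_record)
    return log_record


record = logging.LogRecord("esrally.rally", logging.INFO, "rally.py", 1238, "Hello world", None, None)
record.actorAddress = "-not-actor-"  # set by hand here for the demo
record.taskName = "Task-1"  # set by hand here for the demo
result = apply_mutators(record, {"message": record.getMessage()}, [add_actor_address, add_asyncio_task])
print(result)
```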

favilo · 2024-12-10T21:07:13Z

Okay!

It's been a while since I remembered this, but I've added the ability to change mutators and such.

The current format:

{
  "@timestamp":"2024-12-10T20:26:20.561Z",
  "log.level":"info",
  "message":"Hello world",
  "ecs.version":"1.6.0",
  "log":{"logger":"esrally.rally","origin":{"file":{"line":1238,"name":"rally.py"},"function":"test"}},
  "process":{"name":"MainProcess","pid":3061691,"thread":{"id":140008672693120,"name":"MainThread"}},
  "python":{"asyncio":{"task":"Task-1"},
  "thespian":{"address":"-not-actor-"}}
}

Note: python.asyncio.task only shows up if the record is logged from inside an async task. But it does actually show up now; ecs_logging was in fact eating it.
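
The behaviour in the note can be reproduced with the standard library alone. A small sketch, in which the `ListHandler` class and the logger name are hypothetical: on Python 3.12 and later the LogRecord carries `taskName`, which is None outside a task and the task's name inside one; on older versions the attribute does not exist, hence the `getattr` fallback.

```python
import asyncio
import logging


class ListHandler(logging.Handler):
    """Collects the taskName seen on each record."""

    def __init__(self):
        super().__init__()
        self.task_names = []

    def emit(self, record):
        # taskName exists on LogRecord from Python 3.12; older versions lack it.
        self.task_names.append(getattr(record, "taskName", None))


logger = logging.getLogger("tasklog_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


async def work():
    logger.info("inside a task")


async def main():
    await asyncio.create_task(work(), name="demo-task")


logger.info("outside any task")
asyncio.run(main())
print(handler.task_names)
```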

Removed old stuff, and pulled out MutatorType to make the RallyEcsFormatter definition shorter

gbanasiak

Many thanks for adding this. LGTM. See remark regarding protected access.

esrally/log.py

@gbanasiak

This fixes a lint error that we got from trying to use a protected method. Thanks @gbanasiak for the catch.

favilo requested a review from a team November 14, 2024 18:46

gbanasiak reviewed Nov 18, 2024

favilo force-pushed the rally-json-logs branch from 066bb75 to 378d078 Compare November 18, 2024 17:46

favilo force-pushed the rally-json-logs branch from 32ebe9c to d70823e Compare November 18, 2024 19:53

gbanasiak reviewed Nov 19, 2024

gbanasiak reviewed Nov 20, 2024

favilo added 3 commits December 10, 2024 13:02

feat: Add a parallel JSON log by default

edd9902

For ease of programmatic parsing, and uploading to Kibana with filebeat

docs: Mention rally.json in docs

ea8a579

feat: Add ability to configure ECS JSON logging mutators

14abad5

These take the log record, and add additional fields to it. This is configurable via the `mutators` property in `~/.rally/logging.json`.

favilo force-pushed the rally-json-logs branch from 5959cfb to 14abad5 Compare December 10, 2024 21:03

ci: Lint fixes

e6c000a

Removed old stuff, and pulled out MutatorType to make the RallyEcsFormatter definition shorter

gbanasiak approved these changes Dec 11, 2024

fix: Silly me overriding the wrong method.

39a56a9

This fixes a lint error that we got from trying to use a protected method. Thanks @gbanasiak for the catch.

favilo merged commit 1c412a9 into elastic:master Jan 2, 2025

favilo deleted the rally-json-logs branch January 2, 2025 23:36

dpifke-elastic added the enhancement Improves the status quo label Mar 28, 2025

dpifke-elastic added this to the 2.12.0 milestone Mar 28, 2025
