Examining pico-engine log files

Introduction

The UI Logging tab shows information about each event and query your pico has received within the last 12 hours. But what if you want to go further back in time? This page will show you a way to do this, using a Bash command and some KRL code.

The pico engine maintains twelve weeks of complete logs in the file system. For example, from one of my pico engines, in the pico engine home directory:

$ ls -lrt pico-engine.log*
-rw-r--r--  1 bconrad  staff     78567 Aug 30 09:57 pico-engine.log.11
-rw-r--r--  1 bconrad  staff      1750 Sep  5 14:52 pico-engine.log.10
-rw-r--r--  1 bconrad  staff      1830 Sep 12 09:36 pico-engine.log.9
-rw-r--r--  1 bconrad  staff    539114 Sep 17 16:26 pico-engine.log.8
-rw-r--r--  1 bconrad  staff    892188 Nov 20 14:55 pico-engine.log.7
-rw-r--r--  1 bconrad  staff    711794 Nov 30 19:15 pico-engine.log.6
-rw-r--r--  1 bconrad  staff  22763439 Dec  7 10:18 pico-engine.log.5
-rw-r--r--  1 bconrad  staff     65895 Dec 14 09:31 pico-engine.log.4
-rw-r--r--  1 bconrad  staff    326553 Jan  8 16:01 pico-engine.log.3
-rw-r--r--  1 bconrad  staff  13266393 Jan 24 08:40 pico-engine.log.2
-rw-r--r--  1 bconrad  staff    213635 Jan 25 06:41 pico-engine.log.1
-rw-r--r--  1 bconrad  staff  88551614 Feb  1 08:44 pico-engine.log.0
-rw-r--r--  1 bconrad  staff   1013541 Feb  8 09:06 pico-engine.log

Notice that the logs automatically rotate, about once a week, depending on usage.

Obtaining the entries for a single pico

In the pico-engine folder, you can run a Bash command like this one to extract all of the log entries for a specific pico, using its ID (available in the About tab of the UI):

grep -h '"pico_id":"cjr9g6agx000wiiphj7uwmqur"' pico-engine.log* >public/my_logs.txt

Here the pico we are interested in has the ID cjr9g6agx000wiiphj7uwmqur, so we will get all of the log entries for events and queries delivered to this pico. They will not be in order, so we will sort them in the KRL code shown below. If you just want to look at the raw entries, in order, you could use this code instead:

$ for f in `ls -rt pico-engine.log*`
> do
> grep '"pico_id":"cjr9g6agx000wiiphj7uwmqur"' $f
> done >public/my_logs.txt

This will consider the log files in chronological order and direct the matching lines into a file, here named my_logs.txt, but you can use any name you like (except the name of a file already in the public folder, since you don't want to overwrite one of the UI files).
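
If you prefer a single command, something like this should be equivalent (assuming none of the log file names contain spaces):

$ grep -h '"pico_id":"cjr9g6agx000wiiphj7uwmqur"' $(ls -rt pico-engine.log*) >public/my_logs.txt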

Parsing the log entries

This can be done in KRL, using a function like this one:

    episodes = function(url){
      http:get(url){"content"}
        .split(re#\r?\n#)
        .filter(function(x){x})
        .map(function(x){x.decode()})
        .sort(function(a,b){a{"time"} cmp b{"time"}})
        .collect(function(x){x{["context","txn_id"]}})
        .map(episode)
        .values()
    }

providing it with the URL of the file you extracted from the logs in the previous step. This must be a URL using the http (or https) protocol. By placing the extracted file in our engine's public directory, we can use our engine to serve it up. The URL might look like http://localhost:8080/my_logs.txt if your engine is running on your local machine on port 8080.
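
Before invoking the function, you can confirm that the engine is serving the extracted file, for example with curl (adjust the host and port to match your engine):

$ curl -s http://localhost:8080/my_logs.txt | head -3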

I used the UI Testing tab to invoke the function, but you could use a /sky/cloud URL with curl, or some other mechanism, to invoke the function on a pico in which you have installed a ruleset containing this episodes function.
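
As a sketch of the curl approach, a /sky/cloud URL names one of the pico's channels (ECI), the ruleset ID, and the function, with its arguments passed as query parameters. Assuming the sample ruleset shown at the bottom of this page (ruleset ID log_formatting) is installed in your pico, the invocation might look like this, with <ECI> replaced by one of your pico's channel identifiers:

$ curl "http://localhost:8080/sky/cloud/<ECI>/log_formatting/episodes?url=http://localhost:8080/my_logs.txt"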

In line 2, the function uses HTTP to get the content of the extracted file. It is a sequence of lines, so the String operator split() breaks it into an array of strings, one per log entry (line 3), and the Array operator filter() keeps only non-empty entries (line 4).

Each log entry is a string which encodes a JSON map, so we use the Array operator map() to apply the String operator decode() to each line of the extracted file (line 5).
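
For reference, each decoded entry is a map. Real entries contain more fields (including the pico_id string matched by the grep command above), but the fields used by the functions on this page give an event entry roughly this shape:

{
  "time": "2019-01-25T13:40:47.934Z",
  "msg": "event received: name/name",
  "context": {
    "txn_id": "...",
    "eci": "PUUJY5enk338GpoYSnsfb6",
    "event": { "eid": "__testing", "domain": "name", "type": "name" }
  }
}

A query entry carries a "query" map (with "rid" and "name") in its context instead of an "event" map.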

Line 6 uses the Array operator sort() to put the entries into chronological order.

Each event or query processed by the pico has a unique transaction ID, so in line 7 we use the Array operator collect() to make a map from transaction ID to the array of entries for that event or query. At this point, the log entries are in order and grouped by event or query, which is already useful.
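
As a small illustration of collect(), unrelated to the log data, grouping an array of maps by a key function produces a map from each key to the array of elements sharing it:

    [{"id":"a","n":1},{"id":"b","n":2},{"id":"a","n":3}]
      .collect(function(x){x{"id"}})
    // returns {"a":[{"id":"a","n":1},{"id":"a","n":3}],"b":[{"id":"b","n":2}]}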

We take the further step of adjusting the transactions into a more familiar format by using the Map operator map() (in line 8) to apply the episode function to each transaction (an entry in a map whose key is the transaction ID and whose value is an array of full log entries). Finally we keep the formatted entries as an array using the Map operator values() in line 9.

Formatting each episode

Each event or query has its own transaction ID and its own array of log entries. We are generally interested only in the timestamp and message of each log entry. Furthermore, rather than using the transaction ID as the key, we would prefer to use the timestamp and the event or query description as the key. The episode function meets this specification.

    episode = function(log_entries,key){
      {}.put(episode_key(log_entries,key),log_entries.map(episode_line))
    }

This function will be applied by the Map operator map() to each of the episodes collected earlier. Called with the array of log entries and the key, it will produce a new map (using the Map operator put() in line 2) with one entry, whose key is computed by the episode_key function and whose value is an array of formatted strings, each produced from one of the log entries, as computed by the episode_line function.

The expanded key is computed by the episode_key function:

    episode_key = function(log_entries,k){
      first_log_entry = log_entries[0];
      context = first_log_entry{"context"};
      event = context{"event"};
      query = context{"query"};
      descriptor = event => <<event/#{context{"eci"}}/#{event{"eid"}}/#{event{"domain"}}/#{event{"type"}}>>
                          | <<query/#{context{"eci"}}/#{query{"rid"}}/#{query{"name"}}>>;
      first_log_entry{"time"} + " - " + descriptor
    }

The array of log entries associated with the transaction ID is changed to an array of strings, each containing the entry timestamp and message, as computed by the episode_line function:

    episode_line = function(x){
      msg = x{"msg"};
      x{"time"} + " " +
      (msg.match(re#^\[#) => msg | "[DEBUG] " + msg)
    }

Sample output

A couple of the episodes for the sample pico, taken from the output of the episodes function, are shown here:

[
  ...
  {
    "2019-01-25T13:40:44.366Z - query/PUUJY5enk338GpoYSnsfb6/name/__testing": [
      "2019-01-25T13:40:44.366Z [DEBUG] query received: name/__testing",
      "2019-01-25T13:40:44.366Z [DEBUG] query added to pico queue: cjr9g6agx000wiiphj7uwmqur",
      "2019-01-25T13:40:44.369Z [DEBUG] {\"queries\":[{\"name\":\"__testing\"},{\"name\":\"name\",\"args\":[\"name\"]}],...}"
    ]
  },
  {
    "2019-01-25T13:40:47.934Z - event/PUUJY5enk338GpoYSnsfb6/__testing/name/name": [
      "2019-01-25T13:40:47.934Z [DEBUG] event received: name/name",
      "2019-01-25T13:40:47.935Z [DEBUG] event added to pico queue: cjr9g6agx000wiiphj7uwmqur",
      "2019-01-25T13:40:47.936Z [DEBUG] event being processed",
      "2019-01-25T13:40:47.943Z [DEBUG] rule added to schedule: name -> name",
      "2019-01-25T13:40:47.943Z [DEBUG] rule selected: name -> name",
      "2019-01-25T13:40:47.944Z [DEBUG] name \"[Function]\"",
      "2019-01-25T13:40:47.944Z [DEBUG] fired",
      "2019-01-25T13:40:47.944Z [DEBUG] event finished processing",
      "2019-01-25T13:40:47.945Z [DEBUG] {\"directives\":[{\"options\":{\"name\":\"name\"},\"name\":\"name\",\"meta\":{...}}]}"
    ]
  },
  ...
]

Sample ruleset

A ruleset containing these functions is shown here:

ruleset log_formatting {
  meta {
    shares __testing, episodes
  }
  global {
    __testing = { "queries":
      [ { "name": "__testing" }
      , { "name": "episodes", "args": [ "url" ] }
      ]
    }
    episode_key = function(log_entries,k){
      first_log_entry = log_entries[0];
      context = first_log_entry{"context"};
      event = context{"event"};
      query = context{"query"};
      episode_descriptor = event => <<event/#{context{"eci"}}/#{event{"eid"}}/#{event{"domain"}}/#{event{"type"}}>>
                                  | <<query/#{context{"eci"}}/#{query{"rid"}}/#{query{"name"}}>>;
      first_log_entry{"time"} + " - " + episode_descriptor
    }
    episode_line = function(x){
      msg = x{"msg"};
      x{"time"} + " " +
      (msg.match(re#^\[#) => msg | "[DEBUG] " + msg)
    }
    episode = function(log_entries,key){
      {}.put(episode_key(log_entries,key),log_entries.map(episode_line))
    }
    episodes = function(url){
      http:get(url){"content"}
        .split(re#\r?\n#)
        .filter(function(x){x})
        .map(function(x){x.decode()})
        .sort(function(a,b){a{"time"} cmp b{"time"}})
        .collect(function(x){x{["context","txn_id"]}})
        .map(episode)
        .values()
    }
  }
}

Copyright Picolabs | Licensed under Creative Commons.