NOTE: This article presents 3 scripts. Scroll to the section “Full Path Json with Python (Better – version 2a)” at the bottom for the best one. See the GIST for a downloadable script.
Sometimes JSON files are nested so deeply that it is hard to find the value you are looking for. Of course, one could use JQ, but to be fast with JQ you first need to know the structure of the JSON file. Here is a proposed method for looking through JSON files: full path json. It is just a flat, one-level-deep representation of the json object that looks like this: { full-key-path : value }. It is very easy to grep through and also makes the whole json object easy to understand.
Note the input requirements: the json must be saved to a file as text, and it must be valid JSON that a json parser would accept. For example, keys and other strings must be surrounded by double quotes.
So this would fail to be processed:
{
one: 1,
two: {
three: 3
},
four: {
five: 5,
six: {
seven: 7
},
eight: 8
},
nine: 9
}
But this would process:
{
"one": 1,
"two": {
"three": 3
},
"four": {
"five": 5,
"six": {
"seven": 7
},
"eight": 8
},
"nine": 9
}
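If you are not sure whether a file meets these requirements, one quick check is to try parsing it with Python's standard json module, which is also what the Python scripts further down rely on. This is only a minimal sketch; the helper name is_valid_json is made up for this illustration.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    print(is_valid_json('{one: 1}'))    # False: the key is not quoted
    print(is_valid_json('{"one": 1}'))  # True: the key is a double-quoted string
```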
The output should simply be something that is easy to grep for the words you are looking for.
Example:
{
one: 1,
'two.three': 3,
'four.five': 5,
'four.six.seven': 7,
'four.eight': 8,
nine: 9
}
# Or
{'four.eight': 8,
'four.five': 5,
'four.six.seven': 7,
'nine': 9,
'one': 1,
'two.three': 3}
# Or
{'four/eight': 8,
'four/five': 5,
'four/six/seven': 7,
'nine': 9,
'one': 1,
'two/three': 3}
As you can see, you can easily grep for the word “six” in those outputs to get everything under the key “six”. In the original json, however, grepping for “six” would not return anything meaningful: at best you get the line containing the key itself, not the values nested below it. You would need to use JQ, and for JQ you need to be familiar with the layout of the file. With full path json you see the layout at a glance.
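The same "grep for a word" idea can also be sketched directly in Python once you have a flattened dict. The function name grep_flat is hypothetical and not part of any of the scripts in this article. Unlike grep, which matches the whole output line, this sketch only matches against the key path.

```python
def grep_flat(flat: dict, word: str) -> dict:
    """Keep only the entries whose full key path contains word."""
    return {path: value for path, value in flat.items() if word in path}


if __name__ == "__main__":
    flat = {
        "one": 1,
        "two.three": 3,
        "four.five": 5,
        "four.six.seven": 7,
        "four.eight": 8,
        "nine": 9,
    }
    print(grep_flat(flat, "six"))   # {'four.six.seven': 7}
    print(grep_flat(flat, "four"))  # every entry under the key "four"
```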
Note: Three programs are presented below: one node.js program and two python programs. The worst one is the node.js version, as it sometimes returns objects or errors where the others return the expected output. The best one is the last python one (fullpathjson2.py). Feel free to use whichever you want, though. Scroll to the section “Full Path Json with Python (Better – version 2a)” at the bottom for the best one.
Full Path Json With Node.JS
Here is an attempt via node.js
Step 1. Install node.js
Examples:
For Ubuntu / Debian Linux:
apt install nodejs;
For macOS:
brew install node;
For CentOS Linux:
yum install nodejs;
Step 2. Create the following node.js program. Just put this content into a file called fullpathjson.js.
cat fullpathjson.js
/* ================================================================================
function: read in json file from first arg and print out full path json format.
consider this input file:
$ cat test.json
{
"one": 1,
"two": {
"three": 3
},
"four": {
"five": 5,
"six": {
"seven": 7
},
"eight": 8
},
"nine": 9
}
example:
$ node fullpathjson.js test.json
output:
{
one: 1,
'two.three': 3,
'four.five': 5,
'four.six.seven': 7,
'four.eight': 8,
nine: 9
}
================================================================================ */
const filename = process.argv[2];
const fs = require("fs");
fs.readFile(filename, "utf8", (err, jsonString) => {
  if (err) {
    console.log("Error reading file from disk:", err);
    return;
  }
  try {
    const obj = JSON.parse(jsonString);
    // console.log(obj); // DEBUG: shows given json object
    // Recursively merge nested keys into one flat object keyed by full path.
    const flatObject = (obj, keyPrefix = null) =>
      Object.entries(obj).reduce((acc, [key, val]) => {
        const nextKey = keyPrefix ? `${keyPrefix}.${key}` : key;
        // typeof null is "object", so treat null as a leaf value explicitly;
        // otherwise Object.entries(null) throws and lands in the catch below.
        if (val === null || typeof val !== "object") {
          return {
            ...acc,
            [nextKey]: val
          };
        } else {
          return {
            ...acc,
            ...flatObject(val, nextKey)
          };
        }
      }, {});
    console.log(flatObject(obj));
  } catch (err) {
    console.log("Error parsing JSON string:", err);
  }
});
Step 3. Run the program against a json file like so:
node fullpathjson.js input.json
The output should be easier to interpret and is very easily greppable.
Here is example output from an fio job which saved its results as json. We then ran the resulting json file against fullpathjson.js.
...skip...
'fio version': 'fio-3.7',
timestamp: 1668015571,
time: 'Wed Nov 9 09:39:31 2022',
'global options.iodepth': '1',
'global options.bs': '32k,100k',
'global options.direct': '1',
'global options.runtime': '180',
'global options.rwmixread': '70',
'global options.rw': 'randrw',
'global options.numjobs': '40',
'client_stats.0.jobname': 'job1',
'client_stats.0.groupid': 0,
'client_stats.0.error': 117,
'client_stats.0.job options.filename': '/mnt/nvme1/testfile.bin:/mnt/nvme2/testfile.bin:/mnt/nvme3/testfile.bin:/mnt/nvme4/testfile.bin:/mnt/nvme5/testfile.bin:/mnt/nv',
'client_stats.0.read.io_bytes': 4423680,
'client_stats.0.read.io_kbytes': 4320,
'client_stats.0.read.bw_bytes': 552960000,
'client_stats.0.read.bw': 540000,
'client_stats.0.read.iops': 20250,
'client_stats.0.read.runtime': 8,
'client_stats.0.read.total_ios': 162,
'client_stats.0.read.short_ios': 0,
'client_stats.0.read.drop_ios': 0,
'client_stats.0.read.slat_ns.min': 0,
'client_stats.0.read.slat_ns.max': 0,
'client_stats.0.read.slat_ns.mean': 0,
'client_stats.0.read.slat_ns.stddev': 0,
'client_stats.0.read.clat_ns.min': 324640,
'client_stats.0.read.clat_ns.max': 1766444,
'client_stats.0.read.clat_ns.mean': 604275.614815,
'client_stats.0.read.clat_ns.stddev': 297093.707537,
'client_stats.0.read.clat_ns.percentile.1.000000': 346112,
'client_stats.0.read.clat_ns.percentile.5.000000': 362496,
'client_stats.0.read.clat_ns.percentile.10.000000': 382976,
'client_stats.0.read.clat_ns.percentile.20.000000': 419840,
...skip...
Full Path Json with Python
Here is an example of full path json using python.
Create the following python script and call it fullpathjson.py:
import json
import sys
import pprint
def full_path_json(json_string: str) -> dict:
    def transform(obj, parent_key='', separator='.'):
        items = []
        for k, v in obj.items():
            new_key = parent_key + separator + k if parent_key else k
            if isinstance(v, dict):
                items.extend(transform(v, new_key, separator).items())
            else:
                items.append((new_key, v))
        return dict(items)
    # Parse the JSON string and transform it
    obj = json.loads(json_string)
    return transform(obj)
if __name__ == "__main__":
    # get filename from arg
    filename = sys.argv[1]
    # Read the JSON file and transform it
    with open(filename, 'r') as f:
        json_string = f.read()
    # flattened dict
    fdict = full_path_json(json_string)
    # pretty print
    pprint.pprint(fdict)
Here is a small example json file which we can use as the input; it is the same one from the fullpathjson.js comment header.
$ cat test.json
{
"one": 1,
"two": {
"three": 3
},
"four": {
"five": 5,
"six": {
"seven": 7
},
"eight": 8
},
"nine": 9
}
Now apply the full path json.
$ python fullpathjson.py test.json
{'four.eight': 8,
'four.five': 5,
'four.six.seven': 7,
'nine': 9,
'one': 1,
'two.three': 3}
The output is slightly different; however, it is still very grep-able, and that is the whole goal.
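One thing worth knowing about this first Python version: it only descends into dicts, so a JSON array is kept as a single value instead of being expanded into indexed paths like the client_stats.0... keys in the node.js output above. The sketch below repeats the same dict-only logic as a standalone function (flatten_dicts_only is a name chosen here for illustration, and the sample data is made up) to show that behavior.

```python
import json


def flatten_dicts_only(obj: dict, parent_key: str = "", separator: str = ".") -> dict:
    """Flatten nested dicts into full-path keys; lists are left untouched."""
    flat = {}
    for k, v in obj.items():
        new_key = parent_key + separator + k if parent_key else k
        if isinstance(v, dict):
            flat.update(flatten_dicts_only(v, new_key, separator))
        else:
            flat[new_key] = v
    return flat


if __name__ == "__main__":
    data = json.loads('{"jobs": [{"name": "job1"}], "opts": {"bs": "32k"}}')
    print(flatten_dicts_only(data))
    # {'jobs': [{'name': 'job1'}], 'opts.bs': '32k'}
```

If your json contains arrays and you want them expanded too, the version 2 script further down handles lists.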
Full Path Json with Python (Better – version 2a)
Script fullpathjson2.py
Note: Download this script from my github GIST as fullpathjson.py
import json
import sys
# import pprint
def flatten_json(nested_json: dict, exclude: list = None, delim: str = "/") -> dict:
    """Flatten json object with nested keys into a single level.
    Args:
        nested_json {dict}: A nested json object.
        exclude {list}: Keys to exclude from output.
        delim {str}: path delimiter, default is "/"
    Returns:
        {dict}: The flattened json object.
    """
    exclude = exclude or []
    out = {}
    def flatten(x, name: str = ''):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f"{name}{delim}{a}")
        elif type(x) is list:
            for i, a in enumerate(x):
                flatten(a, f"{name}{delim}{i}")
        else:
            # every path starts with one delimiter; strip it from the key
            out[name[len(delim):]] = x
    flatten(nested_json)
    return out
if __name__ == "__main__":
    filename = sys.argv[1]
    with open(filename, "r") as f:
        dict_content = json.load(f)
    flattened_dict = flatten_json(dict_content)
    # pprint.pprint(flattened_dict, width=3000)
    for key, value in flattened_dict.items():
        print(f'{key}: {value}')
Example:
# Input
$ cat test.json
{
"one": 1,
"two": {
"three": 3
},
"four": {
"five": 5,
"six": {
"seven": 7
},
"eight": 8
},
"nine": 9
}
# Output
$ python fullpathjson2.py test.json
one: 1
two/three: 3
four/five: 5
four/six/seven: 7
four/eight: 8
nine: 9
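The script's main section always calls flatten_json with the defaults, but the function also accepts exclude and delim. Here is a small self-contained sketch of how those two parameters behave, including list handling. The function body is a condensed rewrite in the spirit of the script above rather than a copy of it, and the sample data is made up.

```python
def flatten_json(nested_json, exclude=(), delim="/"):
    """Flatten nested dicts and lists into a single-level dict of full paths."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key in x:
                if key not in exclude:
                    flatten(x[key], f"{name}{delim}{key}")
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, f"{name}{delim}{i}")
        else:
            # Every path starts with one delimiter; strip it from the key.
            out[name[len(delim):]] = x

    flatten(nested_json)
    return out


if __name__ == "__main__":
    data = {
        "one": 1,
        "four": {"five": 5, "six": {"seven": 7}},
        "jobs": [{"id": 0}, {"id": 1}],
    }
    print(flatten_json(data))
    # {'one': 1, 'four/five': 5, 'four/six/seven': 7, 'jobs/0/id': 0, 'jobs/1/id': 1}
    print(flatten_json(data, exclude=["six"], delim="."))
    # {'one': 1, 'four.five': 5, 'jobs.0.id': 0, 'jobs.1.id': 1}
```

Note that exclude drops a key at any depth, together with everything nested under it.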
Full Path Json with Python (Older Good version – version 2_old)
You can get more JSON-like output by printing with pretty print instead of the for loop. That is how the older version of the better script worked, which is why it is called 2_old. To do that, uncomment the pprint import at the top, uncomment the pprint call in the main section, and comment out the for loop in the main section. You can also download the zOld-fullpathjson.py file from the GIST. The script will look like this:
import json
import sys
import pprint
def flatten_json(nested_json: dict, exclude: list = None, delim: str = "/") -> dict:
    """Flatten json object with nested keys into a single level.
    Args:
        nested_json {dict}: A nested json object.
        exclude {list}: Keys to exclude from output.
        delim {str}: path delimiter, default is "/"
    Returns:
        {dict}: The flattened json object.
    """
    exclude = exclude or []
    out = {}
    def flatten(x, name: str = ''):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f"{name}{delim}{a}")
        elif type(x) is list:
            for i, a in enumerate(x):
                flatten(a, f"{name}{delim}{i}")
        else:
            # every path starts with one delimiter; strip it from the key
            out[name[len(delim):]] = x
    flatten(nested_json)
    return out
if __name__ == "__main__":
    filename = sys.argv[1]
    with open(filename, "r") as f:
        dict_content = json.load(f)
    flattened_dict = flatten_json(dict_content)
    pprint.pprint(flattened_dict, width=3000)
    # for key, value in flattened_dict.items():
    #     print(f'{key}: {value}')
Its output with pretty print will look like this. Note the keys are surrounded by quotes, the entirety is surrounded by curly braces, and the keys are sorted. Because of width=3000, an output this small fits on a single line; pprint only breaks the output into one key per line once it no longer fits in that width.
# Input
$ cat test.json
{
"one": 1,
"two": {
"three": 3
},
"four": {
"five": 5,
"six": {
"seven": 7
},
"eight": 8
},
"nine": 9
}
# Output
$ python fullpathjson2.py test.json
{'four/eight': 8, 'four/five': 5, 'four/six/seven': 7, 'nine': 9, 'one': 1, 'two/three': 3}
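If you want the flattened result as strictly valid JSON (double quotes instead of Python's single-quoted dict output), one option that the scripts above do not use is json.dumps with an indent. This is only a sketch; to_json_text is a hypothetical helper name. With indent set, every key lands on its own line, which keeps the output grep-able.

```python
import json


def to_json_text(flat: dict) -> str:
    """Render a flattened dict as valid JSON, one key per line, sorted by key."""
    return json.dumps(flat, indent=2, sort_keys=True)


if __name__ == "__main__":
    flat = {"one": 1, "two/three": 3, "four/six/seven": 7}
    print(to_json_text(flat))
    # {
    #   "four/six/seven": 7,
    #   "one": 1,
    #   "two/three": 3
    # }
```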