The inspiration for this post came from a project at work. I was building a service that required comparing two JSON objects. The catch was that I needed to be able to replace keys, filter out paths, and apply comparison functions to specific nodes.
Obviously, a standard library comparison function such as reflect.DeepEqual() would not work.
The solution was to build an AST (Abstract Syntax Tree) modeled on the JSON objects. Every node in the tree represents either a string, integer, array, or object.
Having the data in this form makes it much easier to apply algorithms to it.
To build this we'll start with the lexer, which generates tokens. Then we'll move on to the parser, which takes the tokens and matches them against the JSON grammar. Finally, we'll add AST hooks to generate the tree.
The final directory structure:
.
main.go
/lexer
    lexer.go
    lexer_test.go
/token
    token.go
/parser
    parser.go
/ast
    ast.go
If you want to see and run the final results:
cd $GOPATH/src/github.com/Lebonesco
git clone https://github.com/Lebonesco/json_parser.git
cd json_parser
go run main.go ./examples/test.json
Lexer
The lexer's job is to take in the JSON data and convert it into a stream of tokens. These tokens include: INVALID, EOF, COMMA, COLON, LBRACE, RBRACE, LBRACKET, RBRACKET, STRING, and INTEGER.
Note: A lexer is also referred to as a scanner.
Let's start down below:
cd $GOPATH/src/github.com/Lebonesco/json_parser
mkdir token
cd token
touch token.go
You have some freedom in how you want to define your tokens. The more data you add to a token, the easier it is to debug.
Note: We'll be using a rune array, []rune, to store our token literals to allow for Unicode characters.
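As a rough sketch, token.go could look something like the following. The ten token names come from the list above. The string-valued TokenType, the Line and Column fields, and the String() helper are assumptions of mine, added to show how a little extra data makes a token easier to debug. It is written as package main so it can run on its own; in the project it would live in package token.

```go
package main

import "fmt"

// TokenType names the kind of a token. Using strings rather than ints is an
// assumption; it keeps debug output readable.
type TokenType string

const (
	INVALID  TokenType = "INVALID"
	EOF      TokenType = "EOF"
	COMMA    TokenType = "COMMA"
	COLON    TokenType = "COLON"
	LBRACE   TokenType = "LBRACE"
	RBRACE   TokenType = "RBRACE"
	LBRACKET TokenType = "LBRACKET"
	RBRACKET TokenType = "RBRACKET"
	STRING   TokenType = "STRING"
	INTEGER  TokenType = "INTEGER"
)

// Token stores its literal as runes so Unicode text is counted per character.
// Line and Column are hypothetical extras that help when debugging.
type Token struct {
	Type    TokenType
	Literal []rune
	Line    int
	Column  int
}

// String renders a token for logs and test failure messages.
func (t Token) String() string {
	return fmt.Sprintf("%s(%q) at %d:%d", t.Type, string(t.Literal), t.Line, t.Column)
}

func main() {
	tok := Token{Type: STRING, Literal: []rune("héllo"), Line: 1, Column: 2}
	fmt.Println(tok)
	fmt.Println(len(tok.Literal)) // counts runes, not bytes
}
```

Storing the literal as []rune means len() reports characters rather than bytes, which matters as soon as a string contains non-ASCII text.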
Next, let's jump into our lexer:
mkdir lexer
cd lexer
touch lexer.go
touch lexer_test.go
The lexer will track where we are in the input and handle any necessary character lookaheads.
In terms of functionality, it will need to be able to create a new token and peek ahead to the next one.
Note: This scanner doesn't support Boolean values, but they could easily be added.
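Here is a minimal sketch of a lexer along those lines. The names Lexer, New, peek, and NextToken are assumptions for illustration, and the token definitions are repeated in trimmed form so the file runs on its own. To stay short it skips string escape sequences and negative numbers, and, like the scanner described above, it has no Boolean support.

```go
package main

import "fmt"

// TokenType and Token repeat, in trimmed form, what the lexer depends on.
type TokenType string

const (
	INVALID  TokenType = "INVALID"
	EOF      TokenType = "EOF"
	COMMA    TokenType = "COMMA"
	COLON    TokenType = "COLON"
	LBRACE   TokenType = "LBRACE"
	RBRACE   TokenType = "RBRACE"
	LBRACKET TokenType = "LBRACKET"
	RBRACKET TokenType = "RBRACKET"
	STRING   TokenType = "STRING"
	INTEGER  TokenType = "INTEGER"
)

type Token struct {
	Type    TokenType
	Literal []rune
}

// punctuation maps single-rune tokens to their types.
var punctuation = map[rune]TokenType{
	',': COMMA, ':': COLON, '{': LBRACE, '}': RBRACE, '[': LBRACKET, ']': RBRACKET,
}

// Lexer holds the input as runes plus the index of the next unread rune.
type Lexer struct {
	input []rune
	pos   int
}

func New(input string) *Lexer { return &Lexer{input: []rune(input)} }

// peek returns the next rune without consuming it, or 0 at end of input.
func (l *Lexer) peek() rune {
	if l.pos >= len(l.input) {
		return 0
	}
	return l.input[l.pos]
}

func isSpace(r rune) bool { return r == ' ' || r == '\t' || r == '\n' || r == '\r' }
func isDigit(r rune) bool { return r >= '0' && r <= '9' }

// NextToken skips whitespace, then scans and returns one token.
func (l *Lexer) NextToken() Token {
	for isSpace(l.peek()) {
		l.pos++
	}
	start, ch := l.pos, l.peek()
	switch {
	case start >= len(l.input):
		return Token{Type: EOF}
	case punctuation[ch] != "":
		l.pos++
		return Token{Type: punctuation[ch], Literal: l.input[start:l.pos]}
	case ch == '"':
		l.pos++ // skip the opening quote
		for l.pos < len(l.input) && l.input[l.pos] != '"' {
			l.pos++
		}
		if l.pos >= len(l.input) {
			return Token{Type: INVALID, Literal: l.input[start:]} // unterminated string
		}
		l.pos++ // skip the closing quote
		return Token{Type: STRING, Literal: l.input[start+1 : l.pos-1]}
	case isDigit(ch):
		for isDigit(l.peek()) {
			l.pos++
		}
		return Token{Type: INTEGER, Literal: l.input[start:l.pos]}
	}
	l.pos++ // consume the unrecognized rune so scanning can continue
	return Token{Type: INVALID, Literal: l.input[start:l.pos]}
}

func main() {
	l := New(`{"age": 42, "tags": ["a"]}`)
	for tok := l.NextToken(); tok.Type != EOF; tok = l.NextToken() {
		fmt.Printf("%-8s %q\n", tok.Type, string(tok.Literal))
	}
}
```

Returning an INVALID token instead of stopping lets the caller decide how to report bad input, and the lexer always advances, so the loop in main is guaranteed to reach EOF.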
Lexer Test
Here we will take in a JSON string and make sure that it outputs the correct stream of tokens.
To run the test:
go test -v
=== RUN TestLexer
--- PASS: TestLexer (0.00s)
PASS
ok github.com/Lebonesco/json_parser/lexer 0.433s
You now have a working lexer!
Parser
This is the part where we take our token stream and match it against the JSON grammar to produce AST nodes.
Written as a set of production rules, JSON is described by the grammar below:
JSON : Value
Value : Object | Array | String | Integer | Bool | Null
Array : '[' [Value {',' Value}] ']'
Object : '{' [Property {',' Property}] '}'
Property : String ':' Value
In the above syntax, [expression] means the expression is optional (it occurs zero or one time) and {expression} means that it occurs zero or more times.
There are tools like gocc that will generate a lexer and/or parser if you provide the grammar. That's the recommended way to go if you're dealing with something more complicated.
But since JSON is fairly simple, we'll do it by hand!
Let's construct the AST nodes that will represent our final results.
mkdir ast
cd ast
touch ast.go
The nodes are fairly simple. If we cared more about error handling and tracking node hashes, like in my use case, we could store more data.
Note: Because Go uses composition instead of inheritance, we need to implement the TokenLiteral() method on each node type in order for each to be treated as a Json node.
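To make the note above concrete, here is a small sketch of what ast.go might contain. The interface name Json and the method TokenLiteral() are taken from the text. The struct layouts, the choice to return the node kind from TokenLiteral(), and the countNodes helper are assumptions for illustration only. Object uses a map here, so property order is not preserved.

```go
package main

import "fmt"

// Json is the interface shared by every node in the tree.
type Json interface {
	TokenLiteral() string
}

// One struct per JSON kind the lexer supports (field layouts are assumed).
type String struct{ Value string }
type Integer struct{ Value int64 }
type Array struct{ Values []Json }
type Object struct{ Props map[string]Json }

// Go has no inheritance, so each node type declares the method itself.
func (*String) TokenLiteral() string  { return "String" }
func (*Integer) TokenLiteral() string { return "Integer" }
func (*Array) TokenLiteral() string   { return "Array" }
func (*Object) TokenLiteral() string  { return "Object" }

// countNodes walks the tree with a type switch and counts every node.
func countNodes(n Json) int {
	total := 1
	switch v := n.(type) {
	case *Array:
		for _, child := range v.Values {
			total += countNodes(child)
		}
	case *Object:
		for _, child := range v.Props {
			total += countNodes(child)
		}
	}
	return total
}

func main() {
	// Tree for {"name": "go", "ids": [1, 2]}, built by hand.
	tree := &Object{Props: map[string]Json{
		"name": &String{Value: "go"},
		"ids":  &Array{Values: []Json{&Integer{Value: 1}, &Integer{Value: 2}}},
	}}
	fmt.Println(tree.TokenLiteral(), countNodes(tree))
}
```

The type switch in countNodes is the same pattern you would use later for hashing, filtering, or comparing nodes.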
Now onto the Parser!
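Below is a minimal recursive descent sketch that follows the grammar above for the four supported kinds (objects, arrays, strings, integers). To keep it self-contained it reads a hand-written token slice instead of calling the lexer, and Token is reduced to two string fields. The names Parser, cur, expect, list, and parseValue are assumptions. To save space, Json is left as an empty interface here rather than requiring TokenLiteral(), and the sketch does not check for leftover tokens after the top-level value.

```go
package main

import (
	"fmt"
	"strconv"
)

// Token is a pared-down stand-in for the lexer's output (an assumption).
type Token struct{ Type, Literal string }

// Json is left empty here to save space; node shapes are assumed.
type Json interface{}

type String struct{ Value string }
type Integer struct{ Value int64 }
type Array struct{ Values []Json }
type Object struct{ Props map[string]Json } // duplicate keys: last one wins

// Parser walks a token slice with a single cursor.
type Parser struct {
	toks []Token
	pos  int
}

// cur returns the current token, or EOF once the slice is exhausted.
func (p *Parser) cur() Token {
	if p.pos >= len(p.toks) {
		return Token{Type: "EOF"}
	}
	return p.toks[p.pos]
}

// expect consumes the given token types in order, failing on a mismatch.
func (p *Parser) expect(types ...string) error {
	for _, want := range types {
		if got := p.cur().Type; got != want {
			return fmt.Errorf("token %d: expected %s, got %s", p.pos, want, got)
		}
		p.pos++
	}
	return nil
}

// list parses "[item {',' item}] closer", calling item once per element.
// The opening bracket or brace has already been consumed by parseValue.
func (p *Parser) list(closer string, item func() error) (err error) {
	for first := true; err == nil && p.cur().Type != closer; first = false {
		if !first {
			err = p.expect("COMMA")
		}
		if err == nil {
			err = item()
		}
	}
	p.pos++ // step past the closer (harmless when err is set)
	return err
}

// parseValue implements Value : Object | Array | String | Integer.
// On error the returned node may be partial and should be ignored.
func (p *Parser) parseValue() (Json, error) {
	tok := p.cur()
	p.pos++
	switch tok.Type {
	case "STRING":
		return &String{Value: tok.Literal}, nil
	case "INTEGER":
		n, err := strconv.ParseInt(tok.Literal, 10, 64)
		return &Integer{Value: n}, err
	case "LBRACKET":
		arr := &Array{}
		return arr, p.list("RBRACKET", func() error {
			v, err := p.parseValue()
			arr.Values = append(arr.Values, v)
			return err
		})
	case "LBRACE":
		obj := &Object{Props: map[string]Json{}}
		return obj, p.list("RBRACE", func() error {
			key := p.cur().Literal
			if err := p.expect("STRING", "COLON"); err != nil {
				return err
			}
			v, err := p.parseValue()
			obj.Props[key] = v
			return err
		})
	}
	return nil, fmt.Errorf("token %d: unexpected %s", p.pos-1, tok.Type)
}

func main() {
	// Hand-written tokens for {"ids": [1, 2]}, standing in for a lexer.
	p := &Parser{toks: []Token{{"LBRACE", "{"}, {"STRING", "ids"}, {"COLON", ":"}, {"LBRACKET", "["},
		{"INTEGER", "1"}, {"COMMA", ","}, {"INTEGER", "2"}, {"RBRACKET", "]"}, {"RBRACE", "}"}}}
	tree, err := p.parseValue()
	fmt.Printf("%T %v\n", tree, err)
}
```

Arrays and objects share the list helper because both follow the same "optional first item, then comma-separated items" shape in the grammar; only the per-item function differs.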
Let's bring it all together and write our driver, main.go.
Note: json.MarshalIndent() is a nice substitute for json.Marshal() to get prettier JSON outputs.
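A quick sketch of the difference between the two calls. The Node struct and the pretty helper are hypothetical and exist only for this demo; json.Marshal and json.MarshalIndent are the standard library functions from encoding/json.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node is a hypothetical, simplified AST node used only for this demo.
type Node struct {
	Type  string      `json:"type"`
	Value interface{} `json:"value"`
}

// pretty renders any value as indented JSON with two-space indentation.
func pretty(v interface{}) (string, error) {
	out, err := json.MarshalIndent(v, "", "  ")
	return string(out), err
}

func main() {
	tree := Node{Type: "Array", Value: []Node{
		{Type: "Integer", Value: 1},
		{Type: "String", Value: "a"},
	}}

	compact, err := json.Marshal(tree)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(compact)) // everything on a single line

	indented, err := pretty(tree)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(indented) // one field per line, nested by two spaces
}
```

The second argument to MarshalIndent is a prefix added to every line and the third is the indent string, so "" and two spaces gives a conventional layout.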
To run:
go run main.go ./examples/test.json
All Done!
Now that we can generate an AST, it's easy to add a rolling hash to each node and do all of the other custom operations, such as:
- substitute node values
- filter out nodes
- apply special comparison functions to nodes
I'll leave those extensions to the reader. Feel free to drop links to your work in the comments. I'd love to see what you come up with.
Thank you for taking the time to read this article.
If you found it helpful or interesting please let me know.
Create a Go Json Parser: Batteries Included was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.