feat: adds libraries for data processing #28
Open
0x6861746366574 wants to merge 19 commits into symbol:main from 0x6861746366574:block_lib
Commits
813b9c9 merged updated block extractor library from local misc package (0x6861746366574)
1373e98 merged block delegates lib from local misc package (0x6861746366574)
7366515 merged block harvester lib from local misc package (0x6861746366574)
8564989 merged nember nft lib from local misc package (0x6861746366574)
1546048 import fix for block extractor modules (0x6861746366574)
0df377c removed integration test stubs for unit test development (0x6861746366574)
dd80570 updated gitignore and README to reflect block library additions (0x6861746366574)
c63eef0 added unit tests for module-level functions in process.py (0x6861746366574)
e970466 added integration tests for extractor and processor (0x6861746366574)
a4d23d3 added large fixtures to enable full extractor integration tests (0x6861746366574)
7cde8ea minor formatting changes and lint cleanup (0x6861746366574)
ddce65c added dev requirements file (0x6861746366574)
4aa6ebc suppress too many locals, branches, statements
262215c missing packages
eff24ff fix generators inside any
b82ea1f missing encodings + silence consider-using-with
479d6b4 disable similarities checker under block dir
3091831 erm, fixed wrong order
8e2f1f9 refactored variable names and argument parsers; added example account… (0x6861746366574)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
*~ | ||
*.pch | ||
*.pyc | ||
*.ipynb | ||
__pycache__/ | ||
.idea | ||
.vscode/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -51,6 +51,109 @@ Example: check accounts in `account/samples/verify_ownership.yaml`. | |
python -m account.verify_ownership --input account/samples/verify_ownership.yaml | ||
``` | ||
|
||
## block | ||
|
||
Running the block extraction scripts requires installation of the local **block** package. This can be accomplished as follows: | ||
```sh | ||
pip install -e . | ||
``` | ||
|
||
### extractor/extract | ||
|
||
_extracts chain data from node files and produces compact output for applications_ | ||
|
||
The extractor is the first step in processing raw block data with the block-level scripts, whether the goal is to drive visualizations or to perform chain analysis. | ||
There are two output types: | ||
- blocks | ||
- statements | ||
|
||
Example: extract data from node files stored in `block/data` | ||
Default output dir is `block/resources` | ||
|
||
```sh | ||
python extractor/extract.py --input data --output resources | ||
``` | ||
gimre-xymcity marked this conversation as resolved.
|
||
|
||
### extractor/process | ||
|
||
_processes extracted chain data to generate useful/readable representation of chain state_ | ||
|
||
The processor streams data output by the extractor and builds human-readable representations of the block headers | ||
as well as a rich, indexable representation of the chain state. | ||
There are two output types: | ||
- block headers | ||
- chain state | ||
|
||
Example: process data from extractor output stored in `block/resources` | ||
Default output dir is `block/resources` | ||
|
||
```sh | ||
python extractor/process.py --input resources --output resources | ||
``` | ||
|
||
### delegates/find_delegates | ||
|
||
_finds current delegates associated with one or more nodes using serialized state data_ | ||
|
||
This script requires a JSON file containing accounts, similar to what is received from the `/node/info` API endpoint; see the example in `resources/accounts.json`. | ||
Review comment: this file is not present (?) |
||
As long as node URLs/names are available it will attempt to get missing information from the nodes. | ||
|
||
Example: find all delegates from nodes listed in `resources/accounts.json` using chain state from `resources/state_map.msgpack`. | ||
Default output dir is `block/delegates/output`. | ||
|
||
```sh | ||
python delegates/find_delegates.py --input resources/accounts.json --state_path resources/state_map.msgpack | ||
``` | ||
|
||
### harvester/get_harvester_stats | ||
|
||
_aggregate harvesting statistics using serialized state data_ | ||
|
||
This script requires a JSON file containing harvester addresses; see the example in `resources/accounts.json`. | ||
Stats are aggregated for the full chain history and binned based on provided frequencies. | ||
The output falls into three categories: | ||
- blocks harvested | ||
- fees collected | ||
- total XYM balance | ||
|
||
Example: get stats for harvesters listed in `resources/accounts.json` using chain state from `resources/state_map.msgpack` and `resources/block_header_df.pkl` | ||
Default output dir is `block/harvester/output` | ||
|
||
```sh | ||
python harvester/get_harvester_stats.py --input resources/accounts.json --state_path resources/state_map.msgpack --headers_path resources/block_header_df.pkl | ||
``` | ||
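Binning by frequency can be pictured with a small standard-library sketch. The script reads `block_header_df.pkl`, which suggests it works on a pandas DataFrame, so the version below is a swapped-in illustration rather than the script's own method. The record layout, the harvester name, and the fixed daily bin are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bin_by_day(blocks):
    """Aggregate blocks harvested and fees collected per (UTC day, harvester)."""
    stats = defaultdict(lambda: {'blocks': 0, 'fees': 0})
    for block in blocks:
        day = datetime.fromtimestamp(block['timestamp'], tz=timezone.utc).date().isoformat()
        entry = stats[(day, block['harvester'])]
        entry['blocks'] += 1
        entry['fees'] += block['fee']
    return dict(stats)


# Hypothetical records: Unix timestamps in seconds, fees in arbitrary units.
blocks = [
    {'timestamp': 1_640_995_200, 'harvester': 'HARVESTER_A', 'fee': 150},  # 2022-01-01 00:00 UTC
    {'timestamp': 1_640_998_800, 'harvester': 'HARVESTER_A', 'fee': 50},   # 2022-01-01 01:00 UTC
    {'timestamp': 1_641_081_600, 'harvester': 'HARVESTER_A', 'fee': 75},   # 2022-01-02 00:00 UTC
]

daily = bin_by_day(blocks)
for (day, harvester), totals in sorted(daily.items()):
    print(day, harvester, totals)
```

A coarser bin (weekly, monthly) would only change how the `day` key is derived; the per-bin counters stay the same.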
|
||
### nft/nember_extract | ||
|
||
_extract transactions corresponding to minting of nember NFTs_ | ||
|
||
Produces two types of output: | ||
- NFT descriptions | ||
- transactions involving NFTs after minting | ||
|
||
Example: extract nember data from chain data in `resources/block_data.msgpack` | ||
Default output dir is `block/nft/output` | ||
|
||
```sh | ||
python nft/nember_extract.py --input resources/block_data.msgpack --output nft/output | ||
``` | ||
|
||
### nft/nember_scrape | ||
|
||
_scrape transactions corresponding to minting of nember NFTs from API nodes_ | ||
|
||
Produces two types of output: | ||
- NFT descriptions | ||
- transactions involving NFTs after minting | ||
|
||
Example: scrape all transactions corresponding to nember NFTs (takes at least a couple of hours) | ||
Default output dir is `block/nft/output` | ||
|
||
```sh | ||
python nft/nember_scrape.py | ||
``` | ||
|
||
|
||
## health | ||
|
||
### check_nem_balances | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
__all__ = ['extractor', 'extractor.util', 'extractor.state', 'extractor.format'] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
from block.delegates.delegates import find_delegates | ||
|
||
__all__ = ['find_delegates'] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
"""Symbol delegate mapping utilities""" | ||
|
||
from binascii import unhexlify | ||
|
||
import requests | ||
|
||
from block.extractor import public_key_to_address | ||
|
||
|
||
def find_delegates(accounts, state_map): | ||
    """Find current delegates for each node based on chain state at final height""" | ||
|
||
    accounts = accounts.copy() | ||
    for acc in accounts: | ||
        if 'nodePublicKey' in acc: | ||
            node_address = public_key_to_address(unhexlify(acc['nodePublicKey'])) | ||
        else: | ||
            print('No node public key present, trying to collect from API') | ||
            try: | ||
                node_key = requests.get(f'http://{acc["name"]}:3000/node/info').json()['nodePublicKey'] | ||
                node_address = public_key_to_address(unhexlify(node_key)) | ||
            except requests.exceptions.ConnectionError: | ||
                print(f'Failed to connect, skipping node: {acc["name"]}') | ||
                continue | ||
|
||
        # initialize delegates with node address | ||
        valid_delegates = [acc['address']] | ||
        invalid_delegates = [] | ||
|
||
        for key, val in state_map.items(): | ||
            if node_address in val['node_key_link']: | ||
                if val['node_key_link'][node_address][-1][1] == float('inf'): | ||
                    if sum(val['xym_balance'].values()) >= (10000 * 1e6): | ||
                        valid_delegates.append(key) | ||
                    else: | ||
                        invalid_delegates.append(key) | ||
        acc.update({ | ||
            'node_address': node_address, | ||
            'valid_delegates': valid_delegates, | ||
            'invalid_delegates': invalid_delegates | ||
        }) | ||
    return accounts |
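The classification rule inside `find_delegates` can be exercised in isolation. The sketch below is not the PR's code. It assumes each `node_key_link` entry is a list of `(start_height, end_height)` pairs in which an end of `float('inf')` marks a link that is still active, and that `xym_balance` values are in micro-XYM, so `10000 * 1e6` would correspond to 10,000 XYM. The helper name, the toy state map, its addresses, and the height keys are hypothetical.

```python
# Threshold copied from find_delegates; assumed to mean 10,000 XYM in micro-XYM.
MIN_BALANCE = 10000 * 1e6


def classify_delegates(node_address, state_map):
    """Split accounts currently linked to node_address into (valid, invalid) lists."""
    valid, invalid = [], []
    for address, state in state_map.items():
        links = state['node_key_link'].get(node_address)
        if not links:
            continue  # never linked to this node
        if links[-1][1] != float('inf'):
            continue  # the most recent link has already ended
        if sum(state['xym_balance'].values()) >= MIN_BALANCE:
            valid.append(address)
        else:
            invalid.append(address)
    return valid, invalid


# Hypothetical toy state map, for illustration only.
toy_state = {
    'ADDR_RICH': {'node_key_link': {'NODE_A': [(100, float('inf'))]}, 'xym_balance': {100: 25000 * 1e6}},
    'ADDR_POOR': {'node_key_link': {'NODE_A': [(150, float('inf'))]}, 'xym_balance': {150: 500 * 1e6}},
    'ADDR_GONE': {'node_key_link': {'NODE_A': [(100, 200)]}, 'xym_balance': {100: 25000 * 1e6}},
    'ADDR_OTHER': {'node_key_link': {'NODE_B': [(100, float('inf'))]}, 'xym_balance': {100: 25000 * 1e6}},
}

valid, invalid = classify_delegates('NODE_A', toy_state)
print('valid:', valid)
print('invalid:', invalid)
```

Unlike the function in the diff, this sketch does not seed the valid list with the node's own `address`; it shows only the per-account test.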
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
#!/usr/bin/env python3 | ||
"""Symbol delegate identification script""" | ||
|
||
import argparse | ||
import json | ||
|
||
from block.delegates.delegates import find_delegates | ||
from block.extractor import XYMStateMap | ||
|
||
if __name__ == '__main__': | ||
|
||
    parser = argparse.ArgumentParser() | ||
    parser.add_argument('--input', type=str, default='resources/accounts.json', help='path to load node information from') | ||
    parser.add_argument('--output', type=str, default='delegates/output/node_delegates.json', help='path to write delegates json') | ||
    parser.add_argument('--state_path', type=str, default='resources/state_map.msgpack', help='path to load state map from') | ||
|
||
    args = parser.parse_args() | ||
|
||
    print(f'Reading state from {args.state_path}') | ||
    state_map = XYMStateMap.read_msgpack(args.state_path) | ||
|
||
    print(f'Reading nodes from {args.input}') | ||
    with open(args.input, 'r', encoding='utf8') as f: | ||
        accounts = json.loads(f.read())['accounts'] | ||
|
||
    print('Identifying delegates . . .') | ||
    delegate_accounts = find_delegates(accounts, state_map) | ||
|
||
    print(f'All accounts processed, writing output to {args.output}') | ||
    with open(args.output, 'w', encoding='utf8') as f: | ||
        f.write(json.dumps(delegate_accounts, indent=4)) | ||
|
||
    print('Delegate analysis complete!') |
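A review comment on the README notes that `resources/accounts.json` is not present in the PR. Judging only from how this script and `find_delegates` read the file, it appears to need a top-level `accounts` list whose entries carry an `address` plus either a `nodePublicKey` or a `name` (a host name used to query `/node/info`). The sketch below writes and checks such a file under that reading; every value is a placeholder and `validate_accounts` is a hypothetical helper, not part of the PR.

```python
import json
import os
import tempfile


def validate_accounts(accounts):
    """Return a list of problems in entries, as find_delegates would read them."""
    problems = []
    for index, acc in enumerate(accounts):
        if 'address' not in acc:
            problems.append(f'entry {index}: missing address')
        if 'nodePublicKey' not in acc and 'name' not in acc:
            problems.append(f'entry {index}: needs nodePublicKey or name')
    return problems


# Placeholder data only; real entries would hold actual addresses and keys.
sample = {
    'accounts': [
        {'name': 'node.example.invalid', 'address': 'PLACEHOLDER-ADDRESS-1', 'nodePublicKey': '00' * 32},
        {'name': 'other.example.invalid', 'address': 'PLACEHOLDER-ADDRESS-2'}
    ]
}

path = os.path.join(tempfile.mkdtemp(), 'accounts.json')
with open(path, 'w', encoding='utf8') as f:
    f.write(json.dumps(sample, indent=4))

# Same read pattern as the script: load the file and take the 'accounts' list.
with open(path, 'r', encoding='utf8') as f:
    loaded = json.loads(f.read())['accounts']

problems = validate_accounts(loaded)
print(problems)
```

An entry with neither `nodePublicKey` nor `name` would raise a `KeyError` in `find_delegates`, which is why the check flags it.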
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
from block.extractor.state import XYMStateMap | ||
from block.extractor.util import encode_address, fmt_unpack, public_key_to_address | ||
|
||
__all__ = [ | ||
    'state', | ||
    'format', | ||
    'util', | ||
    'statements', | ||
    'body', | ||
    'process', | ||
    'XYMStateMap', | ||
    'fmt_unpack', | ||
    'encode_address', | ||
    'public_key_to_address' | ||
] |
btw, it should be possible (and it is already - assuming you pip install all requirement files), to run the tools like
PYTHONPATH=. python3 block/delegates/find_delegates.py
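The same invocation could be scripted from Python, for instance in a test harness. This is a sketch under assumptions: `tool_command` is a hypothetical helper, and it only builds the argument list and environment so the mechanism is visible. It also prepends the repository root to any existing `PYTHONPATH` rather than replacing it, which differs slightly from the shell form above.

```python
import os
import sys


def tool_command(script, repo_root='.', extra_args=()):
    """Build (argv, env) roughly equivalent to `PYTHONPATH=<repo_root> python3 <script> ...`."""
    env = dict(os.environ)
    existing = env.get('PYTHONPATH')
    # Put the repository root first so `import block...` resolves to the local package.
    env['PYTHONPATH'] = repo_root if not existing else os.pathsep.join([repo_root, existing])
    argv = [sys.executable, script, *extra_args]
    return argv, env


argv, env = tool_command(
    'block/delegates/find_delegates.py',
    extra_args=['--input', 'resources/accounts.json']
)
print(argv[1:])
print(env['PYTHONPATH'].split(os.pathsep)[0])
# To launch the tool from the repository root:
#   import subprocess
#   subprocess.run(argv, env=env, check=True)
```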