There are several parsing methods that I might test or adjust over time; it would be nice to make the method selectable, so that one can be chosen either for accuracy or for speed/memory performance.
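One way to make the parsing method selectable is a small name-to-function registry. The following is only a minimal sketch under stated assumptions: the names `PARSERS`, `register_parser`, `get_parser`, and both toy parsers are hypothetical, and each parser extracts only the bucket, timestamp, and remote IP from the leading fields of a log line rather than all 20+ fields.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical type: a line parser returns a dict of fields, or None if the line is malformed.
LineParser = Callable[[str], Optional[Dict[str, str]]]

# Hypothetical registry mapping a method name to its parser function.
PARSERS: Dict[str, LineParser] = {}


def register_parser(name: str) -> Callable[[LineParser], LineParser]:
    """Decorator that records a parser under a selectable name."""
    def decorator(func: LineParser) -> LineParser:
        PARSERS[name] = func
        return func
    return decorator


# Assumes a line starts with: owner, bucket, [timestamp], remote IP.
_LEADING_FIELDS = re.compile(r"^(\S+) (\S+) \[([^\]]+)\] (\S+)")


@register_parser("regex")
def parse_with_regex(line: str) -> Optional[Dict[str, str]]:
    match = _LEADING_FIELDS.match(line)
    if match is None:
        return None
    return {"bucket": match.group(2), "timestamp": match.group(3), "remote_ip": match.group(4)}


@register_parser("split")
def parse_with_split(line: str) -> Optional[Dict[str, str]]:
    # Cut around the bracketed timestamp, then split the remaining pieces on spaces.
    head, sep, rest = line.partition(" [")
    if not sep:
        return None
    timestamp, sep, tail = rest.partition("] ")
    if not sep:
        return None
    head_fields = head.split(" ")
    tail_fields = tail.split(" ")
    if len(head_fields) != 2 or not tail_fields[0]:
        return None
    return {"bucket": head_fields[1], "timestamp": timestamp, "remote_ip": tail_fields[0]}


def get_parser(name: str) -> LineParser:
    """Look up a parser by name, with a readable error for unknown names."""
    try:
        return PARSERS[name]
    except KeyError:
        raise ValueError(f"Unknown parser {name!r}; choose from {sorted(PARSERS)}") from None


# Made-up example line, shortened for illustration.
example_line = "owner123 example-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 - REQID REST.GET.OBJECT key"
parsed = get_parser("regex")(example_line)
```

With this layout, a new method only needs the decorator to become selectable, and two methods can be compared on the same input to check accuracy before comparing speed or memory.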
from __future__ import annotations  # annotations stay unevaluated, so the placeholder types below do not need to exist yet

# DirectoryPath, FilePath, FullLog, and ReducedLog are placeholder type names not defined in this sketch


class S3LogParser:
    def __init__(
        self,
        parsed_folder_path: DirectoryPath,
        s3_log_file_path: FilePath | None = None,
        s3_log_folder_path: DirectoryPath | None = None,
    ):
        # assert XOR on paths options
        # if file, then parse single file according to rules of this class
        # if folder, then iterate directory structure according to rules of this class
        pass

    def _parse_line(self, line: str) -> FullLog | None:
        # Parse a single line of a single log file
        pass

    def _parse_lines(self, lines) -> list[FullLog]:
        # Read in and parse all lines (in buffered style) from a single log file
        pass

    def _reduce_elements(
        self,
        elements: list[str] = ["timestamps", "asset_id", "remote_ip", "bytes_sent"],  # Though actually a constrained literal over all possible 20+ fields
    ) -> list[ReducedLog]:  # Though what constitutes a 'reduced log' type might change from class to class then...
        # Probably via __init__, control which subfields of an S3 log we wish to reduce our parsed output to contain
        pass

    def _iterate_directory(self, s3_log_folder_path: DirectoryPath):
        # The rules for iterating directories; might need some inference on if it's a base/year/month level
        # Natsort did not work out of the box on the base
        pass
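The "assert XOR on paths options" step in `__init__` could look roughly like the sketch below. The helper name `resolve_log_files` is hypothetical, and the plain lexicographic `sorted` over a recursive walk is an assumption standing in for whatever directory iteration rules are eventually chosen.

```python
from pathlib import Path
from typing import List, Optional, Union

PathLike = Union[str, Path]


def resolve_log_files(
    s3_log_file_path: Optional[PathLike] = None,
    s3_log_folder_path: Optional[PathLike] = None,
) -> List[Path]:
    """Return the log files to parse, requiring exactly one of the two path options."""
    # XOR check: both None or both set are equally invalid.
    if (s3_log_file_path is None) == (s3_log_folder_path is None):
        raise ValueError("Specify exactly one of s3_log_file_path or s3_log_folder_path.")

    if s3_log_file_path is not None:
        return [Path(s3_log_file_path)]

    # Assumption: walk every nested level and sort paths lexicographically.
    folder = Path(s3_log_folder_path)
    return sorted(path for path in folder.rglob("*") if path.is_file())
```

Returning a list in both cases lets the rest of the class treat the single-file and folder modes the same way.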